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CHAPTER I 


Introduction 


The mechanics of wave propagation in the presence of obstacles is of great inter- 
est in many branches of engineering and applied mathematics like electromagnetics, 
fluid dynamics, geophysics, seismology, etc. Such problems can be broadly classi- 
fied into two categories: the bounded domain or the closed problem and the un- 
bounded domain or the open problem. Analytical techniques have been derived for 
the simpler problems; however, the need to model complicated geometrical features, 
complex material coatings and fillings and to adapt the model to changing design 
parameters have inevitably tilted the balance in favor of numerical techniques. The 
modeling of closed problems presents difficulties primarily in proper meshing of the 
interior region. However, problems in unbounded domains pose a unique challenge 
to computation, since the exterior region is inappropriate for direct implementation 
of numerical techniques. A large number of solutions have been proposed but only 
a few have stood the test of time and experiment. 

The goal of this thesis is to develop an efficient and reliable partial differential 
equation technique to model large three dimensional scattering problems in electro- 
magnetics. 
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1.1 Background 

Ever since the method of moments (MoM) was introduced by Harrington [4] in 
the late 60s, numerical techniques for predicting electromagnetic field behavior have 
gained in popularity. With increases in computing speeds and memory and the need 
to simulate real-life problems, researchers have been actively trying to refine the older 
numerical methods as well as devise newer and more efficient solution techniques. 
The MoM is based on applying integral equations on the surface of the desired 
structure and computing the fields everywhere in space [5]. For anisotropic materials, 
the entire volume needs to be discretized and a volume integral equation must then 
be solved. However, the matrix obtained from discretizing an integral equation is 
full and thus requires 0(N 2 ) storage, where N is the number of unknowns. In 3D 
problems, this is a serious limitation since the method can scale up to large problems 
only at considerable computational cost. Thus we need to seek solution techniques 
which scale favorably with increasing problem size. 

Partial differential equation (PDE) methods, like finite element and finite differ- 
ence methods, offer the most attractive alternative to integral equation techniques 
since they lead to matrix systems which are sparse. Therefore, only the non-zero 
entries of the final matrix system need to be stored resulting in O(N) storage re- 
quirement. Thus the increase in storage demand with increasing problem size is seen 
to be minimal. 

PDE techniques like finite elements were, however, originally constructed for 
solving bounded domain problems. In recent times, finite elements are increasingly 
being used for modeling unbounded problems, where the desired parameter decays 
off to zero infinitely away from the region of interest. In electromagnetics, the desired 
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parameter is a field quantity like the electric or magnetic field. It is obviously imprac- 
tical to extend the finite element mesh to infinity; thus the mesh must be truncated 
at a suitable distance from the region of interest. Boundary conditions should then 
be applied at this artificial mesh truncation surface such that the boundary appears 
transparent to the propagating field. 

There are two types of mesh truncation conditions: exact and approximate. Exact 
boundary conditions can be placed very close to the region of interest; however, they 
suffer from potential uniqueness problems [6] and give rise to partly full systems. 
The loss of uniqueness associated with systems where the exact boundary condition 
is employed on the mesh truncation boundary is well-known and was first pointed out 
by Mautz and Harrington[7]. Remedies like complexification of the wave number [8] 
and using the combined field integral equation [9] exist, and must be used for a robust 
implementation. Although the problem of interior resonances can be now avoided, 
the finite element-boundary integral system still possesses a partly full system which 
affects its scalability to large problems. 

Approximate boundary conditions, on the other hand, are local in nature but 
preserve the sparsity of the finite element system. This advantage is partly offset 
by the fact that the finite element mesh must be extended some distance away from 
the region of interest and the approximate boundary condition imposed on the mesh 
truncation surface. These boundary conditions work on the principle that the higher 
order terms of the expansion for the propagating field decay rapidly away from the 
target. Therefore, if the truncation boundary is placed far enough from the region 
of interest, the boundary condition on the mesh termination surface needs to absorb 
only the lowest order terms of the field expansion to accurately model the physics 
of the problem. In this thesis, our aim is to examine the performance of these 
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approximate conditions, also known as absorbing boundary conditions (ABCs), in 
practical three dimensional problems and to derive improved boundary conditions 
which will enable more efficient utilization of the available resources. 

1.2 Outline of thesis 

This dissertation describes the development of a finite element method for the 
solution of general three-dimensional scattering problems. The entire research has 
been geared towards a robust, state-of-the-art solution of unbounded domain prob- 
lems in electromagnetics. Improvements in solution convergence, mesh termination 
conditions, algorithmic complexity and computational speed have all been carried 
out with an eye to making the methodology more efficient in terms of computer 
storage and time. 

Chapter 1 presents a brief introduction to the problem, its possible applications 
in science and industry and our motivation in preferring this solution methodology 
over more traditional ones. 

Chapter 2 gives a short review of the fundamental laws that govern electromag- 
netic phenomena and the modeling technique of finite elements. The wave equation 
for electromagnetic fields is derived and various boundary conditions satisfied on 
material interfaces are presented. A brief outline of the method of finite elements 
and its application in electromagnetics is given. 

Chapter 3 provides a detailed review on the construction and implementation 
of scalar and vector shape functions for two- and three-dimensional finite elements. 
Traditional node-based shaped functions sure presented for a wide variety of element 
shapes and their pros and cons are outlined. The problem with nodal basis for a 
full-scale vector formulation is explained and edge-based vector shape functions are 
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introduced. The development of two- and three-dimensional edge bases is presented 
for a wide variety of element shapes. Higher order edge basis functions are pre- 
sented for triangles and tetrahedra along with other recently developed novel shape 
functions. 

Chapter 4 describes the formulation and implementation for closed and open 
domain problems. In the first part of the chapter, the closed problem is solved by 
determining the eigenvalues of an empty or filled metallic cavity. The origin and 
avoidance of spurious modes is discussed. The open problem is then formulated in 
the second part of the chapter and schemes for terminating the finite element mesh 
are discussed. The code is then validated for a wide class of perfectly conducting 
and composite geometries having arbitrary shapes. 

Since the finite element- absorbing boundary condition methodology becomes ex- 
tremely attractive for large problems, it is essential that the computer code be as 
computationally efficient as possible. Chapter 5 details the optimization and the 
subsequent parallelization of the finite element code and the various numerical con- 
siderations associated with it. The strategies for sparse matrix storage as well as 
solution of sparse systems using preconditioned iterative methods is outlined. The 
inherent parallelism in various types of point and block preconditioners is examined 
along with performance figures on the KSR1 and the Intel iPSC/80 massively parallel 
architectures. 

In Chapter 6, new mesh termination conditions which can be applied on termi- 
nation surfaces conformal to the target are derived and applied on some benchmark 
geometries. Since these ABCs are enforceable on doubly curved surfaces, dramatic 
reductions in computer storage and solution time are obtained. The improved bound- 
ary conditions are applied on mesh truncation surfaces composed of combinations of 
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cylinders, spheres and flat planes and their performance examined with respect to 
mesh termination distance and system symmetry. Extensions to more complex mesh 
termination boundaries are possible. 

Chapter 7 concludes the thesis by summarizing the important results obtained 
during the course of this work and its possible future extensions. 
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CHAPTER II 


Basic concepts of electromagnetics and finite 

elements 


As mentioned in the last chapter, this thesis deals with the application of finite 
elements to three-dimensional problems in electromagnetics. Therefore, it is impor- 
tant to have a grasp of both the finite element method as well as electromagnetic 
theory to solve these problems. This chapter is thus divided into two parts. The 
first part gives a brief review of the basic concepts of electromagnetics and the re- 
sulting differential and integral equations pertaining to our problem. The second 
part introduces the reader to the formulation of boundary-value problems with finite 
elements. 

2.1 Electromagnetics: basic concepts 

2.1.1 Maxwell’s equations 

Electromagnetic waves in all space are governed by a set of fundamental equations 
called Maxwell’s equations. In differential form, they are written as 


Vx E = - d a B { : 

(Faraday’s law) 

(2.1) 

VxB = J + § 

: (Maxwell- Ampere’s law) 

(2.2) 

V-D = p : (Gauss’ law) 

(2.3) 
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V B = 0 : (Gauss’ magnetic law) 


(2.4) 


where 


E = electric field intensity in volts/m 
J) = electric flux density in coulombs/sq. m 
H = magnetic field intensity in amperes/m 
B = magnetic flux density in webers/sq. m 

J = electric current density in amperes/sq. m 

p = electric charge density in coulombs/cu. m 


Another fundamental equation, frequently referred to as the equation of continuity, 
is given by 


V J = - 


dp 

dt 


(2.5) 


and expresses the conservation of charge. 

Of the five equations stated in this chapter, only three are independent. Either 
the first three equations, (2. 1-2.3), or the first two equations, (2.1 and 2.2) and 
the fifth equation (2.5), can be chosen as independent equations. The remaining 
two equations, (2.4 and 2.5) or (2.3 and 2.4), can be derived from the independent 
equations and are thus called auxiliary or dependent equations. 

The three independent equations cannot be solved since the number of unknowns 
exceeds the number of equations. Maxwell’s equations become definite only when the 
three constitutive relations between the field quantities are specified. These relations 
describe the macroscopic properties of the medium being considered. For a simple 
medium, they are given by 


D 


cE 


( 2 . 6 ) 
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(2.7) 

(2.8) 


B = n H 

J = <rE 

where the parameters e,p and <7 denote the permittivity (farads/m), permeability 
(henrys/m) and conductivity (mhos/m) of the material, respectively. For anisotropic 
media, the constitutive relations are given as 


D 

= z-E 

(2.9) 

B 

= Jt ■ H 

(2.10) 

J 

= crE 

(2.11) 


We will be considering only isotropic media in the following sections since the gen- 
eralization to anisotropy is trivial and introduces needless algebraic complexity. 

It is usually sufficient to consider the steady-state solution for electromagnetic 
fields as produced by currents having sinusoidal time dependence. The set of Maxwell’s 
equations, using complex phasor notation and the constitutive relations, can then be 
written as 


VxE 

= -jojfiH 

(2.12) 

VxH 

- J - jut E 

(2.13) 

V-(cE) 

= P 

(2.14) 

V-(^H) 

= 0 

(2.15) 

VJ 

= -J^p 

(2.16) 


where u> is the angular frequency of oscillation and the time convention e jmuJt is used 
and suppressed. 
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2.1.2 Wave equations 

The two curl equations, (2.1 and 2.2), can be combined together with the assumed 
constitutive relations, (2.7 and 2.8), to obtain a separate second-order differential 
equation for each field. By taking the curl of (2.1) or (2.2) and eliminating H or E 
respectively, we obtain 

V x —V xE — A^e r E = -juJ (2.17) 

/*r 

Vx^-VxH — kln r ¥L = ( 2 - 18 ) 

where k 0 = is the free-space wave number, e r and /z r are the respective 

relative permittivity and permeability of the medium under consideration and J is 
an impressed or source current. The differential equations shown above are called 
inhomogeneous vector wave equations in three dimensions. 

2.1.3 Boundary conditions 

Mathematically, the solution of a partial differential equation (PDE) like the wave 
equation, outlined in (2.17) and (2.18), is not unique in a region unless boundary 
conditions are specified, i.e, the behavior of the field on the boundary of the region 
of interest. Boundary conditions play the same role in the solution of PDEs that 
initial conditions play in the solution of differential equations for electric circuits. An 
electromagnetic problem is thus completely defined only when it contains information 
about the governing differential equation and the corresponding boundary conditions 
at material discontinuities or inhomogeneities. 

At the interface between two media, say medium 1 and medium 2, the boundary 
conditions can be mathematically expressed as 

nx(E a -E 2 ) = 0 (2.19) 


\V 
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n (A - £> 2 ) = 0 (2.20) 

for electric fields and 

nx(Hi-Hj) = 0 (2.21) 

n • (Bi - B 2 ) = 0 (2.22) 

for magnetic fields, where n denotes the unit normal to the interface. It is assumed 
in (2.20) and (2.21) that neither surface currents nor surface charge exist on the 
boundary. Equations (2.19) and (2.21) state that tangential electric and magnetic 
fields are continuous across dielectric boundaries. 

It is possible to simplify the above boundary conditions at the interface of a 
perfect electric conductor and free-space. Since a perfect conductor cannot sustain 
a field inside it and likewise since the flux lines of B axe continuous, (2.19) can be 
rewritten as 


n x E = 0 


and (2.22) reduces to 


(2.23) 


n • B = 0 (2.24) 

However, the conductor surface can support a surface current ( J 3 = n x H) and a 
surface charge (p a = n • D ). 

2.1.4 Radiation conditions 

It can be shown that the electric field within a finite volume can be derived in 
terms of the sources within the volume and the field values on the surfaces bounding 
that volume. If we make this volume infinitely large, we arrive at the Sommerfeld 
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radiation condition [10] given by 

li mr [^ +; *J=0 (2.25) 

r— oo (Jr 

where t/> is regular at infinity and describes a component of the electric or magnetic 
field and r = yfx 1 + y*. In three dimensions, the radiation becomes 

lim R [V xE + jk a R x El = 0 (2.26) 

R-+ oo l J 

where R = y'x 2 + y 2 + z‘K A similar result exists for the magnetic field. The radia- 
tion condition requires that E and H diminish as R _1 when R oo. 

2.1.5 Radar cross-section 

The radar cross-section (RCS) is a quantity characterizing the scattering from 
an obstacle. It is defined as the area intercepting that amount of power which, 
when scattered isotropically, produces at the receiver a power density equal to that 
scattered by the target under consideration. In the three-dimensional case, the RCS 

is defined as 

a(0 , </>) = lim 4jr/2 2 -p — ^ (2.27) 

v ,r/ ft— oo F* ,nc 

wher F a denotes the scattered field (either E* or H*) at the observation point 
(R, 0, <f>) and F inc represents the incident field , usually a plane wave, coming from 

{R,6 inc ,4> inc ). 

2.2 Finite elements: basic ideas 

The finite element method (FEM) is a numerical method for obtaining approx- 
imate solutions to boundary-value problems in physics and engineering. Very few 
analytical solutions are possible and thus a numerical method like the FEM provides 
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us with an alternative solution technique for these problems. For a particular class of 
such problems, there exist extremum principles by which the solution being sought 
makes an appropriate functional stationary, or, in certain cases, extremal. In other 
problems for which no genuine extremal principles can be derived, the error resulting 
from the substitution of the numerical approximation into the differential equation 
is minimized. 

In the following pages, we will give a brief description of the two accepted for- 
mulation schemes - the Ritz variational method [11] and the Galerkin method of 
weighted residuals [12]. We will then discuss how the finite element method is used 
to solve PDE problems formulated by the two abovementioned schemes. 

2.2.1 Boundary value problem 

We basically seek an unknown function u which satisfies a differential equation 

A(u) = Cu - f = 0 (2.28) 

in a domain f I and certain boundary conditions 

B(tt) = 0 (2.29) 

on the domain boundary F. In electromagnetics, the form of the governing differential 
equation ranges from the simple Poisson equation in statics to the complicated vector 
wave equation such as (2.17) and (2.18). The boundary conditions can also vary from 
Dirichlet and Neumann conditions to more complicated higher-order transition and 
radiation conditions. 

2.2.2 The Ritz method 

The Ritz method, or the Rayleigh- Ritz method, is a variational formulation where 
the solution to the boundary value problem is obtained by searching for the stationary 
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point of the functioned. If the operator C in (2.28) is self-adjoint and positive-definite, 
then the solution to (2.28) can be found by determining the stationary point of the 
functional [1] 

F{u) = \<Cu,u> - <u,f > (2.30) 

with respect to u, where u denotes the trial function. The inner product, denoted 
by angular brackets, is given by 

<a,b>= [ abdV (2.31) 

J Q 

Once the functional is found, we approximate the trial function by the expression 

u = '£ i C i w i = C T w (2.32) 

1=1 

where are basis functions and C% are constant coefficients to be determined. Sub- 
stituting (2.32) in the expression for the functional, we get 

F(u) = l -C T Qf w£w r C - ° T X w f dQ ( 2 - 33 ) 

where C is the column vector of unknown coefficients and the superscript denotes 
the transpose of a vector. On differentiating F(u) with respect to C and setting the 
resultant expression to zero - equivalent to finding the stationary point of F(ii) - we 
obtain a system of equations 

[A] {C} = {b} (2.34) 


where the elements of the matrix A and the vector b are given by 


A 


I WiCwj dSl 
J n 

/ Wif dSl 

J n 




b 


(2.35) 

(2.36) 
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On solving (2.34) for the unknown coefficients {C} of the finite element bases, we 
obtain a solution for the desired quantity everywhere within the computational do- 
main. 

It should be pointed out that the inner product as defined in (2.31) extends the 
applicability of the variational formulation to complex numbers. This is of utmost 
importance in electromagnetics since material fillings are often lossy. However, unlike 
in other branches of science, the functional does not have any physical significance in 
electromagnetics and is thus used sparingly. A further limitation is that the matrix 
A must be symmetric for a variational principle to exist. Problems which give rise 
to unsymmetric matrices need to be handled differently. The weighted residual or 
the Galerkin method provides an alternative, and simpler, approach of formulating 
the finite element equations. 

2.2.3 Galerkin’s method 

Galerkin’s method or the weighted residual method tries to minimize a residual 
in the mean square sense. If we assume that u is an approximate solution to (2.28), 
on replacing u with u in (2.28) we obtain the residual 

n = Cu - f / 0 (2.37) 

Naturally, the best approximation for u would be one that reduces the value of the 
residual to a minimum at all points in ft. The integral of the residual, weighted with 
some known weighting functions u>,, is then required to vanish in ft. 

J dft = 0 (2.38) 

In the Galerkin method, the weighting functions u>, are chosen to be identical to 
the basis functions used for expanding u. With this choice of weighting functions, 
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the residual K is orthogonal to the subspace of functions spanned by the basis of u, 
ensuring that the resulting approximation is in this sense the best possible from the 
space of approximating functions. The expression (2.38) then becomes 

J (wiCvf T c — wif'j dCl = 0 , i = 1 , • • • , N (2.39) 

where u has been expanded as in (2.32). This again leads to a matrix system identical 
to the one given in (2.34), obtained by the Ritz method. One of the advantages of this 
formulation is that the matrix system need not be symmetric for its validity. Also, 
the method can be applied to problems for which no genuine extremum principles 

exist. 

2.2.4 Implementation scheme 

The implementation scheme for FEM follows three broad outlines. 

Step 1 : At first, the problem is discretized by dividing the entire computational 
domain into simple subdomains, the elements. For two-dimensional problems, com- 
monly used elements are triangles, parallelograms and quadrilaterals with straight or 
curved edges. In three dimensional implementations, the elements of choice are usu- 
ally terahedra, curvilinear bricks or prisms with straight or curved surfaces [13, 14]. 

Step 2: Next, a suitable approximation function is chosen for the problem to be 
solved. The form of approximation depends on the type of element and must satisfy 
certain continuity conditions across inter-element boundaries. Further, the form of 
the polynomial function must remain unchanged under a linear transformation from 
one Cartesian coordinate system to another. This requirement is satisfied if the 
polynomials are complete to a specific order like 

u(x, y) = Ci + C 2 X + c$y + C 4 X 2 + c$xy - 1 - c$y (2.40) 

/ 

1 ^ 
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or when the extra terms are symmetric with respect to one another, as in the following 
incomplete third-order polynomial 

u(x, y) = ci + c 2 x + c 3 y + c 4 x 2 + c s xy + c^y 2 + c 7 x 2 y + c 8 xy 2 (2.41 ) 

Such approximation functions have the characteristic that, for fixed x or y, they are 
always complete ploynomials in the other variable. The two examples shown above 
apply in two-dimensions; the extension to three-dimensional elements is trivial. 

Once the order of the polynomial is selected, we can derive an expression for the 
unknown solution in an element, say the eth element, having the following form: 

«* = £ (2.42) 

k= 1 

where n is the number of bases in the eth element, u £ is the value of the unknown 
function at node, edge or facet j and N% is the basis (or shape) function for the 
element. 

Step 3: In the third step, we enforce the extremum principle by substituting 
(2.42) into the functional expression (2.30) or into the residual value (2.37). On 
imposing the stationarity of the functional as explained in Section 2.2.2, we obtain 
the system of linear equations (2.34). An alternative procedure of minimizing the 
weighted residuals (2.38) yields an identical system of linear equation (2.34). As 
mentioned earlier, both the variational (Ritz) and the weighted residual (Galerkin) 
formulations are equivalent when the matrix system is symmetric. In addition, the 
Galerkin method can handle unsymmetric systems. Finally, the solution of (2.34) 
specifies the values of the desired function everywhere within the computational 
domain. 

In the next chapter, we will be discussing step 2 in more detail. The two subse- 
quent chapters are devoted to the derivation of the finite element equations for our 
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application and the optimizations undertaken to enable rapid solution of the system. 


w 



CHAPTER III 


Shape functions for scalar and vector finite 

elements 


The finite element method is used for modeling a wide class of problems by 
breaking up the computational domain into elements of simple shapes. Suitable 
interpolation polynomials (or shape functions) are used to approximate the unknown 
function within each element. It is then possible to program the computer to solve 
complicated geometries by specifying the shape functions only. The element choice, 
however, needs human intervention and intelligence to ensure a reliable solution of 
the of the problem at hand. 

In this chapter, we will discuss the derivation of node-based and edge-based shape 
functions for two dimensional and three dimensional finite elements. Node-based 
shape functions have been used extensively in civil and mechanical engineering ap- 
plications as well as in scalar electromagnetic problems. However, a full three di- 
mensional vector formulation brings out numerous deficiencies in these traditional 
element shape functions [15, 16]. Edge-based shape functions have thus been derived 
to overcome the problems associated with nodal bases and are now being applied 
widely for solving vector problems in electromagnetics. We will also describe a gen- 
eral procedure for deriving higher-order shape functions for node and edge basis. 
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3.1 Node-based elements 

In node-based finite elements, the form of the sought function in the element 
is controlled by the function values at its nodes. The approximating function can 
then be expressed as a linear combination of basis functions weighted by the nodal 
coefficients. If the function values ti ■ at the nodes are taken as nodal variables, then 
the approximating function for a two-dimensional element e with p nodes has the 

form 

u e {x,y) = ^^i N t( x ^y) 

1 = 1 

Since the expression (3.1) must be valid for any nodal variable u', the basis function 
jVf (x, y) must be unity at node i and zero for all remaining nodes within the element. 

Shape functions can be derived either by inspection (Serendipity family) or through 
simple products of appropriate polynomials (Lagrange family). It is easier and more 
systematic to construct higher-order bases in the Lagrange family while progression 
to higher orders is difficult in the Serendipity family. However, Lagrange shape func- 
tions have undesirable interior nodes and more unknowns than Serendipity shape 
functions of the same order. All shape functions derived in the following sections 
impose function continuity or C 0 continuity (not slope continuity) between elements. 

3.1.1 Two dimensional elements 

Rectangular elements 

The simple shape of the rectangular element permits its shape functions to be written 
down merely by inspection. On examining the element shape given in Figure (3.1), 
the shape functions can be cast in the form 
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where x\ and y\ denote the coordinates of the mid-points of the edges, h% and h‘ 
represent the edge length and A e denotes the area of the element. Higher order 



Figure 3.1: Rectangular element 


rectangular elements are presented in Zienkiewicz [13]. However, these elements can 
model only regular geometries and are thus not very useful in practice. 

Irregular geometries can be modeled by using quadrilateral elements which can 
also be viewed as distorted rectangles. To construct basis functions for a quadrilateral 
element, we need to use a transformation that maps a quadrilateral element in the 
xy-plane to a square element in the £r? plane (Figure 3.2). Such a transformation can 
be found by satisfying the following relation at the four nodes of the quadrilateral 
element: 


x = a + b£ + cri-\- d£r) y = a' + b'£ + c'rj + d'£r) 


(3.2) 
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Figure 3.2: Transformation of a quadrilateral element in the xy plane to a unit square 
in the £77 plane 


On solving for the unknown coefficients a, . . . the basis functions can be cast in 
the following form 


Ni — — (1 + £ 0 ) (1 Tfo) ? i — 1, ... 9 4 


(3.3) 


where = ((, and i)„ = The variables (&,!);) denote the coordinates of the ith 
node in the (£, t/) coordinate system. 

Triangular elements 

Triangular elements are popular since they can model arbitrary geometries. We will 
determine the shape functions of triangular elements by using Lagrange interpolation 
polynomials. Let us consider a point P within a triangular element (Figure 3.3). The 
area of the smaller triangle formed by points p, 2 and 3 is given by 



x y 

A y\ 


A yl 


(3.4) 




l 
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3.1.2 Three dimensional elements 

Shape functions for three dimensional elements can be described in a precisely 
analogous way to their two dimensional counterpart. However, the simple rules for 
inter-element continuity given previously must be modified. The nodal field values 
should now interpolate to give continuous fields across the face of each element. 


Rectangular bricks 

The simplest polynomial approximation to a rectangular brick element is the trilinear 
function 


u e (x, y, z) = a e + b'x + c e y + d e z + e e xy + f'yz + g e zx + h e xyz 


(3.9) 


whose eight parameters are uniquely defined by the values of the function u at the 
eight corners of the brick. From the eight resulting equations, we can determine the 
coefficients a e ,b e ,...,h e and write the final expression in the form 


d e {x,y,z) = 'E,N?{x,y,z)u e i 
*= 1 


(3.10) 


However, this is a cumbersome process and can be easily avoided by writing down 
the required basis functions by mere inspection. Since the basis function N' must 
be unity at node i and zero at the remaining nodes, the eight interpolation functions 

ran be written down as 

NS = ^( I ' + T" I )(» S+ 2 L_!, )( Z ' + £"*) 

NS = ^(-*;+! + *)(»' + T -! ')( Z ' + 2^ 2 ) 

NS = ^ ^ + *) (“*' + + + 

ns = ^ + f - *) ("»' + 
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where x', y', and z\ denote the coordinates of the center of the element, ft' , ft' , and 
ft' represent the edge lengths of the element and V* denotes the element volume. 
Bricks with quadratic interpolation functions need 20 degrees of freedom and thus 
have node points at the corners and the mid-points of each edge. 

Shape functions for hexahedral elements or distorted bricks can be derived by 
mapping the element in the xyz coordinate system onto a standard cube in a new 
coordinate system. The required transformation yields 

g g g 

* = y = '£ N t^v,0vt, » = £A7K.<».0*f (Ml) 

i=l *=1 


where 


JVf = 1(1 + f,f)(i + <m)( 1 + Ci() (312) 

with denoting the coordinates of the ith node. 

Tetrahedral elements 

The three dimensional analogue of a two-dimensional triangle is a tetrahedron (four- 
faced element). Once again, we can introduce special coordinates, called volume 
coordinates or simplex coordinates, to simplify the derivation of shape functions. If 
P is a point within the tetrahedron shown in Figure 3.4, the four volume coordinates 
are given by 

Volume P234 

1 Volume 1234 
Volume P341 

L>2 = 


Volume 1234 
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Figure 3.4: Tetrahedral element 

Volume P412 

L 3 = Volume 1234 
Volume P123 

La = Volume 1234 


> (3.13) 


and any position within the element is specified by 

4 4 4 

x = Y! L i x i ; y = E '■> Z ~Y1 LiZi 


>=1 1=1 i=l 

Quadratic shape functions for a tetrahedron necessitates the use of ten node points 
- the 4 corner nodes and the remaining 6 on the mid-points of the edges. 


Other elements 

Other three-dimensional elements having simple shapes include the triangular prism 
and isoparametric elements. It is much easier to discretize a complicated structure 
by using parallelopipeds in combination with prism elements. To ensure that a small 
number of elements can model a relatively complex region, curved or isoparametric 
elements can be used. There is a good review of such elements in [13, 17]. 
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3.2 Edge-based elements 

In electromagnetics, we encounter several serious problems when node-based el- 
ements are employed to represent vector electric or magnetic fields. First, spurious 
modes are observed when modeling cavity problems using node- based elements [18]. 
Secondly, special care needs to be taken to impose boundary conditions at material 
interfaces and conducting surfaces [19]. The first limitation can also jeopardize the 
near-field results of a scattering problem, the far-field escapes contamination since 
spurious modes do not radiate. 

Edge-based finite elements, whose degrees of freedom are associated with the 
edges of the finite element mesh, have been shown to be free of the above shortcom- 
ings. They were described by Whitney [20] over 35 years ago and have been revived 
by Nedelec [21] and Bossavit and Verite [15] and Hano [22]. Mur and de Hoop [23], 
van Welij [24], Barton and Cendes [25] and Lee,et al [26] have extended their appli- 
cability to various two- and three-dimensional shapes and even constructed higher 
order elements for a more accurate approximation of the field values. 

3.2.1 Two dimensional elements 

Rectangular elements 

We first consider the rectangular element first since its shape function is usually the 
easiest to formulate. For the element shown in Figure 3.1, we can find its edge- 
based finite element basis function merely by inspection. If the edges are numbered 
according to Table 3.1 and considering that the basis function should be unity along 
one edge and zero over all others, the vector basis functions can be written as 

= £(-»+*+£)* 
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Edge no. 

i i 

*2 

1 

1 

2 

2 

4 

3 

3 

1 

4 

4 

2 

3 


Table 3.1: Edge numbering for rectangular element 


W' 2 = 

±1 
h e 1 

V 


\ X 

w% = 

1 

K 

( e h 

0 * 

w\ = 

l 

K 

( e 

[ X - X ‘ + Y J 



where x, y and z are the unit vectors in the Cartesian coordinate system. The electric 
field within the finite element is then given by 


E e = ^ E i W i ( 3 - 14 ) 

1=1 

where E- denotes the tangential field along the tth edge. The basis functions Nf 
guarantee tangential continuity across inter-element boundaries since they have a 
tangential component only along the ith edge and none along the other edges. They 
are also divergenceless within the element and possess a constant non-zero curl. It 
should be noted that by taking the cross-product of z with W ] , we obtain basis 
functions which possess normal continuity across element boundaries, have zero curl 
and non-zero divergence. The latter are ideal for representing surface current densi- 
ties and are known as roof-top basis functions in electromagnetics. They have found 
extensive use in the solution of integral equations [27]. 

Edge basis for quadrilateral elements can be derived by carrying out the trans- 
formation detailed in the derivation of nodal basis for quadrilaterals in the previous 
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section and then taking the gradient of the resulting expression for each edge. The 
edge-based quadrilateral element has two shortcomings. First, the integrals associ- 
ated with these elements do not lend themselves to easy evaluation and secondly, the 
basis functions are not divergence free. However, their ability to model complicated 
shapes with a lesser number of unknowns than tetrahedra and property of tangential 
continuity across elements make them attractive for use in two-dimensional vector 
formulations. 

Triangular elements 

Since the edges of an arbitrary triangular element are not parallel to the iory axis, 
it is not easy to guess the form of the vector basis function by inspection. Therefore, 
the edge basis for a triangular element is expressed in terms of its area coordinates, 
L\,L 2 and L 3 . These are the so-called Whitney elements. If the local edge numbers 
are defined according to Table 3.2, then edge bases for a triangular element are 
defined as 

W k = Ny = LiVLj - LjVLi, i,j = 1, . . . , 3 (3.15) 

where W k denotes the basis function for the fcth edge of the element. The vector 
field inside the triangular element can, therefore, be expanded as 

E e = j2E e k W e k (3.16) 

fc=i 

where El denotes the tangential field along the Arth edge. It can be easily shown 
that the edge-based functions defined in (3.15) has the following properties within 
the element: 


V-iV.y = 0 
V xNij = 


2 V Li x VLj 
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Edge no. 

*i 

1 2 

1 

1 

2 

2 

2 

3 

3 

3 

1 


Table 3.2: Edge numbering for triangular element 

If ei is the unit vector pointing from node 1 to node 2 in Figure 3.3, then ei-VLi = —1 
and ei • VT 2 = 1. Since L\ is a linear function that varies from unity at node 1 and 
zero at node 2 and L 2 is unity at node 2 and zero at node 1, we have 

• ^12 = L \ + Z> 2 = 1 (3-17) 

along the entire length of edge 1. This implies that JVi 2 has a constant tangential 
component along edge 1. Moreover, since Li vanishes along edge 2 and L 2 vanishes 
along edge 3, AT 12 has no tangential component along these edges. Thus, tangential 
continuity is preserved across inter-element boundaries but normal continuity is not. 
A different method of constructing edge basis functions for triangular elements is 
given in [28]. 

Higher order vector basis functions include the contribution of facet elements to 
the approximating function. Unknowns in the triangular element are assigned as 
shown in Figure 3.5 [29]. The tangential projection of the vector field along edge 
{i, j) is determined by two unknowns E{ and E] and two facet unknowns - F\ and 
Fi - are provided to allow a quadratic approximation of the normal component along 
two of the three edges. Only two facet unknowns are required to make the basis 
functions of second order complete. Therefore, there are 8 degrees of freedom for 


ny 
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Figure 3.5: Second order triangular edge element 
each triangular element. The higher order vector field within the element is given by 

3 3 

E e = E £ EfUVLj + FfLtLiVLs + F^L X L 2 VL 2 (3.18) 

1=1 j = 1 

where we have arbitrarily chosen the facet variables to lie on edges 1 and 2. These 
variables are local unknowns associated with each separate triangular element and are 
included to provide a linear approximation for V ( x E t , where the subscript t denotes 
the tangential component. Since the edge variables provide common unknowns across 
element boundaries, tangential continuity of the field over the boundary is assured. 
However, an obvious disadvantage of these elements is that the 2 facet variables 
cannot be symmetrically assigned. This disadvantage can be avoided by using third 
order elements [30]. 
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3.2.2 Three dimensional elements 


Edge-based elements have facilitated to a great degree the finite element analysis 
of three dimensional structures in electromagnetics. Linear nodal basis with their 
problem of spurious modes and difficulty in maintaining only tangential continuity 
across material interfaces are not as convenient for electromagnetic field simulations 
in three dimensions. On the other hand, the introduction of edge based shape func- 
tions provide a robust way of treating general three dimensional problems having 
material inhomogeneities and structural irregularities like sharp edges and corners. 

In the following section, we will consider the simple rectangular bricks first and 
will proceed to derive edge-based shape functions for more complicated structures 
like tetrahedrals and curvilinear hexahedrals. The chapter is concluded with a brief 


discussion on hierarchical edge elements. 

Rectangular bricks and hexahedrals 

As in the two dimensional case, we derive the edge-based shape function for a rect- 
angular brick (see Figure 3.6) by simple inspection. Since a constant tangential field 
component must be assigned to each edge of the element, we can express the shape 
function along each edge of the element as [31] 
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where h e x ,h e y ,h e z denote the edge lengths in the x, y, and 2 directions, respectively, 
and the center coordinates of the brick are given by ( x e c ,y e c , z®). If the edge numbers 
are defined as in Table 3.3, the expression for the vector field within the element can 
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be expressed as 


12 


E' = ££;w; 


(3.19) 


1=1 


where E* represents the value of the electric field along the ith edge. The vector basis 


Edge no. 

Node tj 

Node i 2 

1 

1 

2 

2 

4 

3 

3 

5 

6 

4 

8 

7 

5 

1 

4 

6 

5 

8 

7 

2 

3 

8 

6 

7 

9 

1 

5 

10 

2 

6 

11 

4 

8 

12 

3 

7 


Table 3.3: Edge definition for rectangular brick 


Nf defined for the rectangular brick element have zero divergence and a nonzero curl. 
Furthermore, the expansion (3.19) guarantees tangential continuity of the electric 
field across the surfaces of the elements. 

A rectangular brick element has limitations in the sense that it is unable to 
model irregular geometries. Due to this reason, the analog of the two dimensional 
quadrilateral ( the hexahedral element) finds much wider use in modeling practical 
three dimensional problems. As in the case of the quadrilateral element in two 
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dimensions, a hexahedral element in Cartesian coordinates can be seen as the image 
of a unit cube under a trilinear mapping to the coordinate system (see Figure 



Figure 3.7: Mapping of a hexahedral element to a unit cube 

Let us consider those faces for which £ =constant. Therefore, V £ must then 
possess only a normal component on that face. Since £ varies linearly along the 
edges that are parallel to the £-axis, the vector function V£ has nonzero tangential 
components only along those edges that are parallel to the £-axis. Using the node- 
based expression for the shape function in a hexahedral element given in ( 3 . 12 ), we 
may write the corresponding edge bases as 


wf-^i + wHi + CiCm, 

edges || to £-axis 

(3.20) 

w' = ^( i+f.-0(i + CiC)VT, 

edges || to 77 -axis 

(3.21) 

w: = |(i+«)(i+^)vc, 

edges || to £-axis 

(3.22) 


where (&,»?«, C») denote the coordinates at edge i. 

The vector bases derived above possess all the desired continuity properties of 
edge elements and generally result in about half the number of unknowns than that 



obtained by tetrahedral gridding. However, these basis functions are not divergence- 
less and it is difficult to generate a finite element mesh of an arbitrary structure using 
hexahedral elements. 

Tetrahedral elements 

Tetrahedral elements are, by far, the most popular element shapes to be employed 
for three dimensional applications. This is because the tetrahedral element is the 
simplest tessellation shape capable of modeling arbitrary three dimensional geome- 
tries and is also well-suited for automatic mesh generation. The derivation of shape 
functions for these elements follow the same pattern as that for triangular vector 
basis functions. If we consider the tetrahedron shown in Figure 3.4 and define the 
edge numbers according to Table 3.4, we have 


m = JV'i = L\VL) - L'VLl i, j = 1, . . . , 4 


(3.23) 


and the vector field within the element can be expanded as 


Ar=l 


(3.24) 


A nice explanation of the physical character of the edge-based interpolation function 
is given by Bossavit [32]. Let us consider edge number 1 connecting nodes 1 and 2. 
Since VZ 2 is orthogonal to facet {134} and VZi is orthogonal to facet {234}, the 
field turns around the axis 3-4 and is normal to planes containing 3 and 4. The field 
thus has only tangential continuity across element faces. Edge elements can also be 
described as Whitney elements of degree 1. 

Whitney elements of the second degree are called facet elements because they are 
constant over the face of the tetrahedron. The vector function for the facet element 
can be written as 


N ijk — 2 (LiVLj x V Lk + LfVLk x VZ,- + Z*VZj x VZ_,) , i,j,k = 1, . . . ,^3.25) 
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Edge no. 

Node t'i 

Node *2 

1 

1 

2 

2 

1 

3 

3 

1 

4 

4 

2 

3 

5 

4 

2 

6 

3 

4 


Table 3.4: Edge definition for tetrahedron 


As explained in [32], we now have a central field (emanating as if from node 4 in 
Figure 3.4) on each of the two tetrahedra that share the face {1,2,3}. The field can 
be imagined as coming from the ‘source’ 4, growing, crossing the facet and vanishing 
into the ‘well’ 4', the fourth vertex of the other tetrahedron. This field thus has 
normal continuity and the flux across the facet forms the degree of freedom for the 
element. 

Alternative expressions for linear basis inside a tetrahedron have been derived in 
[25]. They are given by 


with 


wu = 


f V-i + gi-i x r, 

0 , 


r in the tetrahedron 
otherwise 


(3.26) 


fV- 


g7 — , 


hj-i 

6V e 

6V e 


r„ x r, 2 


(3.27) 

(3.28) 


in which i = 1, 2, . . . , 6, V e is the volume of the tetrahedral element, = (r, 2 — )/ /i, 
is the unit vector of the ith edge and /i, = |r, 2 - r„ | is the length of the ith edge 
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with i\, and i\ s denoting the position vector of the t’i and i 2 nodes. It can be shown 
that (3.23) is identically equal to (3.26) when simplified. Therefore, 


g7_, = / 17 -i x 1 * — 1, • • • 


where i t and i 2 are given in Table 3.4. The basis functions given in (3.26) have zero 

divergence and constant curl (VxW^ = 2g'). 

The order of the polynomial approximation for the first order edge element given 
in (3.23) or (3.26) can be taken as 0.5. This is because the value of the basis function is 
constant (0(1)) along the edge it supports and is linear (O(r)) everywhere else within 
the element. Mur and de Hoop [23] presented edge elements which are consistently 
linear, yielding a linear approximation of the field both inside each tetrahedron and 
along its edges and faces. Since this requires two unknowns per edge, there are 
12 degrees of freedom per element. The basis functions in [33] are derived by first 
defining the outwardly directed vectorial areas of the faces as 

Ai = Tj x r k + rk x ri + ri x r, (3.29) 


where r,, i = 1, ... ,4 denote the position vectors of the vertices of the tetrahedron 
and are cyclic. Then the edge-based vectorial expansion function is defined 


by 


Nii(r) = 


M r ) A i 

3V 


— 1) ■ • • / J 


(3.30) 


where V is the volume of the tetrahedron and <f>(r) is a linear scalar function of 
position given by 



(r - r t ) • Ai 

SV 


in which rj, is the position vector of the centroid of the tetrahedron. We observe that 
<j>i(r) equals unity when r = r; and zero for the remaining vertices of the tetrahedral 
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element. In that sense, they are very similar to the simplex or volume coordinates 
mentioned earlier. They also satisfy the following equalities: 

r = 53^i(r)r, 

*=1 

Z < M r ) = i 

1=1 

The edge basis function JV,j is a linear vector function of position inside the tetra- 
hedral element and its tangential component vanishes on all edges of the element 
except the one joining vertices i and j. N tJ varies linearly along the edge formed by 
nodes i and j such that Nij • Tj = 0 while 

ATii • (ri - fj) = 1 

These basis functions have non-zero values of divergence and curl. 

An inspection of the expressions for the vectorial areas reveals that the form is 
identical to that obtained by taking the gradient of one of the simplex or volume 
coordinates mentioned earlier. In other words, the three components of the vector 
A\ have the same functional dependence as that obtained by VIq 

1 2/2 
*3 1 2/3 

x 4 1 y 4 

where L\ is the volume coordinate for a tetrahedron defined in (3.13), det indicates 
the value of the determinant of the matrix and Xi,yi,Zi denote the coordinates of 
the ith vertex. The basis functions with consistently linear interpolation in the 
tetrahedron can now be rewritten in more convenient notation as 


Vi > = 6V 



2/2 

1 

22 


x 2 

1 

22 


x det 

2/3 

1 

23 

— y det 


1 

2 3 

+ z det 


2/4 

1 

2 4 


x 4 

1 

2 4 



W/. = N,j = LiVLj, i,j = jtj 


(3.31) 
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Still higher order basis functions are sometimes necessary for rapidly changing 
fields or for modeling extremely thin structures where linear interpolation for the 
highly stretched elements is not enough. The second order edge basis (0(r 15 )) for 
a tetrahedral element was first presented by Lee, Sun and Cendes [33]. We need 20 
degrees of freedom to achieve a quadratic approximation of the vector field inside 
a tetrahedron (see Figure 3.8). Accordingly, the field within a tetrahedron can be 
written as 

E = £ £ EIL^L, + £ {F[L t L 3 VL k + F' 2 L x L k VL } ) (3.32) 

i=l j=l »= 1 

where i,j,k form cyclic indices. The facet variables F{ and are common unknowns 



Figure 3.8: Tetrahedral element 

for two tetrahedra that share the same face. Even higher order edge-based elements 
up to polynomial order 2 can be constructed. Each tetrahedral element now has 30 
unknowns - 3 along each edge and 3 on each face. 
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Other elements 

Recently, Wang and Ida proposed a systematic method for the construction of curvi- 
linear elements in [34]. The vector shape function is expressed in the following form: 

w i( r ) = Mt* 1 hQ v <'( r )> i = (3.33) 

where <£,(£,» 7,C) are completely defined in the local coordinate system, t>, contains 
the edge and facet information and M denotes the number of degrees of freedom in 
the element. These basis functions usually lead to a symmetric system of equations. 
However, it is difficult to find commercial mesh generation packages which construct 
curvilinear elements for a wide range of geometrical configurations. 

Hierarchical vector elements 

Finite elements are said to be hierarchical when the basis functions for an element 
are a subset of the basis functions for any elment of higher order [13]. The basis 
functions described in [35] are hierarchical and tangentially continuous. Vector el- 
ements complete upto polynomials of order 2 axe available and basis functions of a 
given order are fully compatible to be used with basis functions of lower or higher 
orders. Thus elements of different orders could be used in the same mesh - lower 
order elements could be used in regions where field variation is uniform and higher 
order elements employed in regions where the field varies rapidly. 

The implementation of hierarchical vector elements can be a bit tricky, especially 
at the transition boundaries where elements of one order merge into the elements 
of another order. If several vector elements share an edge, the field tangent to the 
edge can be made identical in each of the tetrahedra. This is done by setting the 
coefficient of the corresponding basis function for the edge in all tetrahedra to be 
identical. For tangential continuity across a face, the same equality must be enforced 
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between the coefficients of all the edge and facet functions associated with the face. 
Table 3.5 gives the basis functions for hierarchical vector finite elements. Higher 
order basis functions are constructed by systematically adding the extra terms upto 
the desired order. It should be noted that the bases for the tetrahedron with 6 and 


Element type 

Polynomial 

order 

Unknowns 
per element 

Basis function 

Edge 

0.5 

6 

LiVLj - LjV Li 

Edge 

1 

12 

V {LiLj) 

Face 

Face 

1.5 

20 

LkLjVLi 

LkLiVLj 

Edge 

Face 

2 

30 

ViLiLjiLi-Lj)) 

VlLiLjLk] 


Table 3.5: Hierarchical basis functions for tetrahedron 


20 unknowns shown in Table 3.5 is identical to the linear and second order edge basis 
given in (3.23) and (3.33), respectively. 



CHAPTER IV 


Vector finite elements for 3D electromagnetic 

problems 


Finite elements have been used extensively to model open and closed domain 
electromagnetic problems in scalar form in two and three dimensions[14, 36, 37]. 
But a reliable full vector formulation proved to be extremely difficult to implement. 
The cause of the problem was found to be the traditional nodal basis functions that 
were being used to discretize the unknown field variable. The reasons for the failure 
of node-based elements in modeling the vector wave equation will be discussed in 
a later section. Fortunately, a novel remedy was found by assigning the degrees of 
freedom to the edges rather than to the nodes of elements. These types of elements 
had been described by Whitney[20] in terms of geometrical forms about 35 years 
back and were revived by Nedelec [21] in 1980. In recent years, Bossavit [15] and 
others [22, 23, 24, 25] applied these edge-based finite elements successfully to model 
three dimensional problems. In all these works, edge elements were seen to be devoid 
of the shortcomings commonly experienced with node-based elements. 

The goal of this thesis was to develop a general purpose code for computing the 
scattering pattern of three dimensional composite structures having complex shapes. 
Edge based functions with their robustness in modeling general three dimensional 
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problems were evidently our choice. Since the only simple shape to model an arbi- 
trary three dimensional space is a tetrahedron, we settled on edge-based tetrahedra 
as our mesh discretization units. The mesh truncation was chosen to be done with ab- 
sorbing boundary conditions (ABCs) which are local in nature, preserve the sparsity 
of the finite element system and permit scalability to large problems with minimal 
storage -O(N)- and computational time - 0{k * N),k < N - penalties. 

In the first part of this chapter, we present the weighted residual or weak for- 
mulation for the closed domain problem and solve for the eigenvalues of a metallic 
cavity having arbitrary shape. In passing, we briefly describe the problem of spurious 
solutions encountered with node-based elements. We also validate our methodology 
by comparing the computed eigenvalues with analytically derived ones. In the latter 
part of the chapter (Part II), we formulate the open domain problem in terms of the 
variational functional, describe the enforcement of boundary conditions for perfectly 
conducting and composite targets and present the proof of the mesh termination 
condition in detail. We then validate our solution by comparison with measured or 
analytically derived data. 

Part I : CLOSED DOMAIN PROBLEM 

Solving Maxwell’s equations for the resonances of a closed cavity is important in 
understanding and controlling the operation of many devices, including particle ac- 
celerators, microwave filters, microwave ovens and optical fibers. However, the exact 
eigenvalues can be obtained only for simple geometries. For arbitrarily shaped cav- 
ities, numerical techniques like the finite element method must be used, but the 
occurrence of spurious modes [18] in the node-based finite element approach has 
plagued the computation of their eigenvalues. This difficulty can be circumvented 
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with the introduction of a penalty term [38] to render the finite element vector field 
solutions non-divergent. However, it is difficult to satisfy continuity requirements 
across material interfaces and treat geometries with sharp edges [39] using classical 
finite-elements, obtained by interpolating the nodal values of the vector field com- 
ponents. As mentioned in the Introduction to this chapter, edge elements, a type 
of vector finite elements with their degrees of freedom associated with the edges of 
the mesh, have been shown to be free of these shortcomings. Generally these lead 
to more unknowns but the higher variable count is balanced by the greater sparsity 
of the finite element matrix so that the computation time required to solve such a 
system iteratively with a given accuracy is less than the traditional approach [25]. 

Here we solve for the eigenvalues of an arbitrarily shaped metallic cavity using 
node-based and edge-based vector finite elements. The computed data are then 
compared with analytical results for empty and partially filled cavities. A comparison 
between the storage intensity and computational accuracy for edge- based rectangular 
bricks and tetrahedra is also presented. Finally, we compute the eigenvalues of a 
metallic cavity with a ridge along one of its faces. 

4.1 Formulation 

4.1.1 Finite element equations 

Consider a three dimensional inhomogeneous body occupying the volume V. To 
discretize the electric field E within this volume, we subdivide the volume into small 
tetrahedra or rectangular bricks, each occupying the volume V e (e = 1,2, ...,Af), 
where M is the total number of elements. For a numerical solution, we expand E 
within the eth volume element as 
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E = I>J W i (4-1) 

j=i 

where W' are the edge-based vector basis functions, E* denote the expansion coef- 
ficients of the basis, m represents the number of edges comprising the element and 
the superscript stands for the element number. On substituting this into the usual 
vector wave equation and upon applying Galerkin’s technique, some vector identities 
and the divergence theorem, we obtain the weak form of Maxwell’s equation 


II 

— (V x 

w?) • (V x W') - k 2 0 e r w: ■ w; 

j=l JV ‘ 1 

M T 

■ 

jkoZo 

£ w '- 

(n x H) ds 


(4.2) 


where R- represents the weighted residual integral for element e, S e denotes the 
surface enclosing V e , n is the outward unit vector normal to S e , Z Q is the free- 
space intrinsic impedance and e r , \i T is the material permittivity and permeability, 
respectively. Equation (4.2) can be conveniently written in matrix form as 


{i?} = (4.3) 


where 


= f — (V x W‘)-(Vx Wj)dv 

(4.4) 

3 JV e fir 


Bti = jf, erWf .WJ* 

(4.5) 

c; = jkoZol W* • (n x H) ds 

J 5e 

(4.6) 


find on assembling the equations from all the elements making up the geometry, we 
obtain the system 


M 


M 


M 


e=l 


e=l 


e=l 


Ew) = e m £[*']{£■} -£{£'} 

e=l 


= { 0 } 


/ 


(4.7) 
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where all matrices and vectors following the summation sign have been augmented 
using global numbers. 

Due to the continuity of tangential H at the interface between two dielectrics, an 
element face lying inside the body does not contribute to the last term of (4.7) in the 
final assembly of the element equations. As a result, the last term of (4.7) reduces to 
a column vector containing the surface integral of the tangential magnetic field only 
over the outer surface of the body. In this application, the surface enclosing the vol- 
ume of the body V is perfectly conducting and, thus, the coefficients associated with 
the edges bordering the perfectly conducting surface can be set to zero a priori. This 
reduces the original unknown count and eliminates the need to generate equations 
for those edges/unknowns which would have otherwise involved the column vector 
{C e }. Also since { C e } is only associated with boundary edges, the surface integral 
associated with it vanishes and (4.7) can be written as 

[/(]{£} = A [£]{£> (4.8) 

where [A] and [ B ] are N x N symmetric, sparse matrices with N being the total 
number of edges resulting from the subdivision of the body excluding the edges on 
the boundary, { E } is a IV x 1 column vector denoting the edge fields and A = Jc% 
gives the eigenvalues of the system. A solution of (4.8) will yield the resonant field 
distribution {£} and the corresponding wavenumber k 0 . 

4.1.2 Origin of spurious solutions 

Conventional finite element basis functions give rise to spurious solutions when 
(4.8) is solved. As Wong and Cendes points out in [40], the origin of these spurious 
solutions lies in the infinitely degenerate eigenvalue k = 0 in the spectrum of (4.8). 
Given the eigenvalue system in 4.8 along with the PEC boundary condition n x E = 0 
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on the boundary, there exists an infinite number of scalar functions t/> such that t/> - 0 
on the boundary. Then E = -V0 is a permitted eigenfunction corresponding to the 
eigenvalue k = 0. If the discretization scheme fails to model this infinite dimensional 
nullspace of the curl operator exactly, spurious solutions to the eigenvalue problem 

will appear. 

One way to get rid of spurious modes is to formulate the eigenvalue problem such 
that k = 0 is no longer a permissible eigenvalue. This is achieved by enforcing 

V • E = 0 (4-9) 

exactly everywhere in the solution region. Then the only solution corresponding to 
the k = 0 eigenvalue is the trivial one E = 0. This is also the reason why spurious 
solutions do not occur when the Helmholtz equation is discretized. In finite elements, 
solving a problem (4.8) along with a constraint (4.9) is well known [13]. Researchers 
have mostly tried the penalty function approach of constrained minimization [38, 39] 
since it is simple to implement. However, the penalty approach is a mere fix and 
not a cure for the problem. Since the spurious eigenmodes are now shifted far into 
the visible spectrum, they are not completely eliminated and are dependent on an 
user-defined parameter which specifies how strongly the divergenceless condition is 

to be imposed. 

Other than the penalty method, derivative continuous finite elements (C 1 ele- 
ments) have also been proposed [40] to alleviate this problem. In this method, an 
auxiliary vector field £ is introduced such that 

E = VxC ( 4 * 10 ) 

Since substitution of (4.10) for E into (4.8) results in second derivatives, we need 
to construct first derivative continuous elements or C 1 elements. As shown in [40], 
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discretization of E using node-based C 1 elements eliminates the problem of spurious 
solutions since the nullspace of the curl operator is modeled exactly. However, C 1 
elements are not commonly found in finite elements and need to be explicitly derived 
for the problem at hand. 

Another method of eliminating spurious modes, without getting rid of the eigen- 
value k = 0, is by using edge elements [41]. Bossavit in [41] provides a mathematical 
proof as to why spurious modes do not appear with edge based elements and why 
they are likely to be present with node-based vectorial elements. However, special 
node-based elements, like the C 1 element in [40], do not present this problem. 

Thus the root cause of spurious modes appears to be the improper modeling of the 
nullspace of the curl operator. Any basis function which approximates it correctly 
will be stable and free of spurious modes. As it turns out, conventional Lagrangian 
finite elements are unsuitable; either C 1 node-based elements or edge-based elements 
of any order can be used to obtain the true solutions. 

4.1.3 Basis functions 


Edge no. 

*i 

*2 

1 

1 

2 

2 

1 

3 

3 

1 

4 

4 

2 

3 

5 

4 

2 

6 

3 

4 


Table 4.1: TETRAHEDRON EDGE DEFINITION 
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The vector edge-based expansion functions for rectangular bricks were presented 
in [31]. Vector fields within tetrahedral domains can be conveniently represented 
by expansion functions that are linear in the spatial variables and have either zero 
divergence or zero curl. The basis functions defined in [25] are associated with the 
six edges of the tetrahedron and have zero divergence and constant curl. To define 
them, let us assume that i x and i 2 are the terminal nodes of the ith edge and the 
six edges of a tetrahedron are numbered according to Table 4.1. The vector basis 
function associated with the (7 — i)th edge of the tetrahedron is then given by 


with 



fV-t + 87 -i x r, 

0, 


r in the tetrahedron 
otherwise 


(4.11) 


fV-« 


87— t 


br -. 

sT" x r ' 

^i^7— i®» 

“6 vT 


(4.12) 

(4.13) 


in which i = 1, 2, . . . , 6, V e is the volume of the tetrahedral element, e; = (r,- 2 - r tl )/6, 
is the unit vector of the tth edge and 6,- = | r » 2 — r tl | is the length of the tth edge 
with r,j and r, 2 denoting the location of the t x and i 2 nodes. 

In general, the implementation of the above discretization will involve two num- 
bering systems, and thus some unique global edge direction must be defined to ensure 
the continuity of n x E across all edges [42]. Here we choose this direction to be 
coincident with the edge vector pointing from the smaller to the larger global node 
number. Finally, since V • W? = 0, the electric field obtained from a solution of (4.3) 
satisfies the divergence equation within each element and, thus, the solution will be 
free from contamination due to spurious solutions. 


-> 
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Mode 

Analytical 

Computed 

(bricks) 

270 

unknowns 

Computed 

(tetra.) 

260 

unknowns 

Error (%) 
(bricks) 

Error (%) 
(tetra.) 

TE 10 i 

5.236 

5.307 

5.213 

-1.36 

.44 

TM no 

7.025 

7.182 

6.977 

-2.23 

.70 

TE 0 ii 

7.531 



-2.58 


TE 2 oi 




-3.13 

-.56 

TMui 

8.179 


7.991 


2.29 

TE U1 



8.122 


.70 

TM 2 10 

8.886 

9.151 

8.572 

-2.98 

3.53 

te 102 

8.947 

9.428 

8.795 

-5.38 



Table 4.2: EIGENVALUES (Ar 0 ,CM -1 ) FOR AN EMPTY lCM X 0.5CM X 0.75CM 
Rectangular Cavity 

4.2 Results 


In Table 4.2, we present a comparison of the percentage error in the computation 
of eigenvalues for a 1cm x .5cm x ,75cm rectangular cavity using edge-based rect- 
angular bricks and tetrahedra. The edge-based approach using tetrahedral elements 
predicts the first six distinct non-trivial eigenvalues with less than 4 percent error 
and is seen to provide better accuracy than rectangular brick elements. The maxi- 
mum edge length for the rectangular brick elements was .15cm whereas that for the 
tetrahedral elements was .2cm. To investigate this matter further, we considered a 
cubical metallic cavity having a side length of .5cm. A plot of the percentage error 
in calculating the first three degenerate resonant frequencies versus the number of 
unknowns is given in Figure 4.1 for both rectangular bricks and tetrahedral elements. 
It is clear in this example that the tetrahedral elements predict the eigenvalues with 
greater accuracy than the rectangular bricks. 

In Tables 4.3, 4.5 and 4.4, we compare the exact eigenvalues with those computed 
using edge-based tetrahedral finite elements. The finite element mesh was generated 
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Figure 4.1: Performance comparison of rectangular bricks and tetrahedrals. 

using SDRC I-DEAS, a commercial pre-processing package and it is seen that the 
numerical results are in good agreement with the exact values for both homogeneous 
and inhomogeneous cavities. The exact eigenvalues of the half-filled cavity as de- 
scribed in Table 4.3 are computed by solving the transcendental equation obtained 
upon matching the tangential electric and magnetic fields at the air-dielectric inter- 
face. As seen, these results agree with those predicted by the finite element solution 
to within 1 percent (no symmetry was assumed in this solution). Similar comparisons 
are given in Table 4.4 for a sphere having 1cm radius. 

Finally, Table 4.6 presents the eigenvalues of the geometry illustrated in Figure 
4.2. This is a closed metallic cavity with a ridge along one of its faces. 

It is noted that as the degeneracy of the eigenvalues increases, the matrix be- 
comes increasingly ill-conditioned and the numerical solution is correspondingly less 
accurate [43]. This is clearly observed from the data in Table 4.4 for the case of a 
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Mode 

Analytical 

Computed 

192 

unknowns 

Error (%) 

TE^ioi 

3.538 

3.534 

.11 

TEz 2 oi 

5.445 

5.440 


TE^io2 

5.935 

5.916 

.32 

TE2 301 

7.503 

7.501 


TE^202 

7.633 

7.560 

.97 

TEzioa 

8.096 

8.056 



Table 4.3: Eigenvalues (^cm' 1 ) for a Half-Filled 1cm x 0.1cm x 1cm Rectangular 
Cavity Having a Dielectric Filling of c T = 2 Extending from 2 = 0.5cm to 
2 = 1.0cm. 



Figure 4.2: Geometry of ridged cavity 

perfectly conducting hollow spherical cavity. Since the second lowest TM mode has 
five-fold degeneracy, the computational error is seen to be the greatest. However, for 
the partially filled rectangular cavity, the absence of degenerate modes gives results 
which are accurate to within 1 percent of the exact eigensolutions. We finally remark 
of the inherent presence of zero eigenvalues in our computations whose number is 
equal to the internal nodes. These zero eigenvalues axe easily identifiable and since 
they do not correspond to physical modes, they were always discarded. 
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Mode 

Analytical 

Computed 

Error (%) 



300 




unknowns 


TMqio 

2.744 

2.799 

-2.04 

TMin f even 


2.802 

-2.11 

TMm.odd 


2.811 

-2.44 

TM 021 

3.870 

3.948 

-2.02 

TMi21,even 


3.986 

-2.99 

TMi2l,odd 


3.994 

-3.20 

TM 221 .even 


4.038 

-4.34 

TM221,odd 


4.048 

-4.59 

TEon 

4.493 

4.433 

1.33 

TEni.even 


4.472 

.47 

TEm ( odd 


4.549 

-1.25 


Table 4.4: Eigenvalues (k 0 , cm x ) for an empty spherical cavity of radius 1cm 


4.3 Conclusions 


It was shown that the resonant frequencies of an arbitrarily shaped mhomo- 
geneously filled metallic resonator can be computed very accurately via the finite 
element method using edge-based tetrahedral elements. The same method in con- 
junction with node-based elements is much less reliable and not readily applicable 
to regions containing discontinuous boundaries in shape and material. Edge-based 
rectangular bricks do not provide as good an accuracy as edge-based tetrahedral 
elements and their use is further limited to a special class of geometries. 


Part II : OPEN DOMAIN PROBLEM 

Of generic interest in electromagnetic scattering is the modeling of composite config- 
urations comprised of metallic and non-metallic sections. In the case of man-made 
structures, abrupt material discontinuities and metallic corners are also encountered 
along with resistive sheets and thin ferrite coatings intended for controlling the scat- 


55 


Mode no. 

Analytical 

Computed 

Error(%) 

TM 010 

4.810 

4.809 

.02 

TE 111 

7.283 

7.202 

1.1 



7.288 

-.07 

TM 110 

7.650 

7.633 

.22 



7.724 

-.97 

TM 011 

7.840 

7.940 

-1.28 

TE 211 

8.658 

8.697 

-.45 



8.865 

-2.39 


Table 4.5: Eigenvalues for an empty cylindrical cavity of base radius 0.5cm and height 
0.5 cm (380 unknowns) 

terer s radar cross-section (RCS). Differential equation methods, especially the finite 
element method (FEM), with its capability of handling arbitrary geometries and its 
versatility in modeling inhomogeneities and material discontinuities has been a viable 
solution approach for bounded domain problems. However, for unbounded problems 
as is the case with electromagnetic scattering, the solution is more involved since the 
finite element mesh needs to be truncated artificially at some distance from the object 
with a suitable boundary condition. These boundary conditions can be either global 
or local. Global boundary conditions are exact but lead to fully populated submatri- 
ces thus spoiling the sparse, banded structure of the finite element system. Further, 
problems due to internal resonances may arise in many cases [44]. In contrast, local 
conditions such as the absorbing boundary conditions (ABCs), are approximate but 
have the important advantage of retaining the sparsity of the matrix system. Also, 
they are free from the interior resonance problem that plagues boundary integral 
termination schemes [44]. ABCs are essentially differential equations enforced at the 
mesh truncation boundary and are chosen to suppress non-physical reflections from 
that boundary, thus ensuring the outgoing nature of the waves. 
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No. 

(a) 

(b) 

1 

4.941 

4.999 

2 

7.284 

7.354 

3 

7.691 

7.832 

4 

7.855 

7.942 

5 

8.016 

7.959 

6 

8.593 

8.650 

7 

8.906 

8.916 

8 

9.163 

9.103 

9 

9.679 

9.757 

10 

9.837 

9.927 


Table 4.6: Ten lowest non-trivial eigenvalues (*o,cm x ) for the geometry drawn in 
Figure 2: (a) 267 Unknowns; (b) 671 Unknowns 

A variety of ABCs have been derived and widely employed in FEM solutions of 
open region two-dimensional scattering problems. However, the method’s implemen- 
tation and performance for scattering by three dimensional geometries using edge- 
based finite elements has not received similar attention. The only three-dimensional 
implementations of the FEM for scattering has been a hybrid solution combined with 
the boundary element method (BEM) [42, 31, 45] and those formulations combined 
with ABCs [46, 47]. The boundary element method, though exact, is equivalent to 
employing a global boundary condition for terminating the mesh and consequently 
leads to a full submatrix, restricting the method’s utility to small geometries. For 
large-scale three-dimensional applications, it is necessary to employ an ABC for ter- 
minating the mesh to retain the 0{N) storage requirement, characteristic of the 
finite element method. However, the use of traditional node-based elements suffers 

from various difficulties as mentioned in the Part I. 

To avoid these difficulties, we consider an implementation of the FEM using vec- 
tor basis functions whose degrees of freedom are associated with the fields along 
the six edges of a tetrahedron. Our implementation is further coupled with a mesh 
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termination scheme based on the vector ABCs derived in [48, 49]. In contrast to 
the implementation proposed in [50], the one presented here preserves the symmetry 
of the finite element system, thus being computationally more efficient and making 
it ideally suited for solution via a conjugate gradient type of algorithm. Further, 
the implementation discussed in [50] requires that the absorbing boundary be placed 
nearly a wavelength away from the scatterer, whereas in our implementation re- 
markably accurate results are obtained with the ABCs enforced a small fraction of 
a wavelength from the scattering body. This is probably due to the accuracy of the 
second order ABCs derived in [49]. 

4.4 Formulation 

In the following section, the open domain problem is formulated in terms of the 
finite element functional. The final system is obtained by setting the first variation in 
the functional to zero and then a Rayleigh-Ritz minimization is performed to arrive 
at the final answer. 

4.4.1 Derivation of finite element equations 

Let us consider the problem of scattering by an inhomogeneous target associated 
with possible material discontinuities. To solve for the scattered fields via the FEM, it 
is necessary to enclose the scatterer- embedded inside the volume V- by an artificial 
surface S 0 on which the ABC is enforced (see Figure 4.3). The ABCs to be considered 
in this chapter axe the Sommerfeld radiation condition given by 

n x V x E* = -jk 0 n x n x E a (4.14) 

and the second-order ABC [49] which can be written as 


n x V x E s = aE* + /?V x [n(V x E 4 )„] -f /?V t (V • E*) 


(4.15) 
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Figure 4.3: Illustration of scattering structure Vj enclosed by an artificial mesh ter- 
mination surface, S 0 , on which the absorbing boundary condition is im- 
posed. 

where a = jk,/3 = 1/(2 jk + 2/r), E* represents the scattered electric field, n is 
the unit normal to the surface S 0 and the subscripts t and n denote the transverse 
and normal component to S 0 , respectively. When these ABCs are employed on the 
artificial boundary S 0 , they annihilate all field terms of 0(r - ^ 2m+1 ^) and smaller, 
where m denotes the order of the ABC. The ABCs outlined above were derived for 
spherical surfaces [49] but in this work we have extended their application to S 0 
which include flat sections. In this case, the local curvature is used to replace 1/r 
in (4.15). For flat sections, the 1/r term, therefore, reduces to zero. This permits 
the construction of termination boundaries conformal to the scatterer, thus reducing 
the size of the the computational domain. A detailed derivation of (4.15) as well as 
other more general ABCs are outlined in Chapter 6. 

The vector ABCs (4.14) and (4.15) can be combined and more conveniently writ- 




59 


ten as 

nxVxE’ = P{E S ) (4.16) 

for the scattered field formulation in which E* is the working variable and 

nxVxE = P{ E) + U’ nc (4.17) 

for the total field formulation where the unknown is the total electric field. In (4.17), 

U‘ nc = n x V x E‘ nc - P (E ,nc ) (4.18) 

where E = E* + E mc is the total field and E mc is the incident electric field. Con- 
sidering (4.17) to be the boundary condition employed at S 0 , we can express the 
functional for the total electric field as 

F(E) = f — (V x E) ■ (V x E) - Jt^ r E • E dV 

JV [fl r 

+ [ [e • P(E) + 2E • U <nc ] dS (4.19) 

J So 

where e r and /x r are the relative permittivity and permeability, respectively. 

The above functional can be generalized to account for the presence of impedance 
and resistive sheets or other discontinuous boundaries. In the case of a resistive card, 
the transition condition [51] 

n x (n x E) = -Rn x (H + - H") (4.20) 

must be enforced, where H ± denotes the total magnetic field above and below the 
sheet, R is the resistivity in Ohms per square and n is the unit normal to the sheet 
pointing in the upward direction (+ side). For an impenetrable impedance surface, 
the appropriate boundary condition on that surface is 


n x (n x E) = —ijh x H 


(4.21) 
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dV 


(4.22) 


where n is the unit normal to the surface and rj is the surface impedance. Taking 
into consideration these boundary /transition conditions, the functional for the total 
electric field can be more explicitly written as 

F( E) = J v ^-(V xE)-(VxE)-**e,E-E 
+jk 0 Z 0 J s (n x E) • (n x E) dS 
+ / [E • P{ E) + 2E • U"* c ] dS 

J Sq 

where K is the surface resistivity (R) when integrating over a resistive card and 

equals the surface impedance (rj) for an impedance sheet. 

In order to deal with anisotropic scatterers, the functional outlined in (4.22) 
undergoes a slight modification since the material properties of the scatterer (perme- 
ability and permittivity) are now second rank tensors rather than scalars. Equation 
(4.22) can therefore be written as 

F(E) = /[(VxE)-{pl''-(VxE)}- k\ E • {? ■ E}] dV 

J V 


(4.23) 


where 


and 


+jk 0 Z 0 J (n x 

E) 

• (n x 

+ / [E ■ P(E) -1- 2E • 

J So 
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The symmetry of the final system of equations now depends on the symmetry of the 
permeability and the permittivity tensors. 

The formulation presented above is in terms of the total field but we can easily 
revert to a scattered field formulation by setting E a = E — E mc and noting that the 
scattered field satisfies the wave equation inside the domain of interest. 

Let us consider the case where the computational volume V is occupied by a 
dielectric structure and is bounded internally by the surface of a perfect conductor 
and externally by the mesh termination boundary. On examining the terms inside 
the volume integral in (4.19), we can define 


L 


— (VxE)-(VxE)-A: 0 2 e r E-E 

V-T 


dV = G(E,E) 


(4.26) 


Expressing the above relation in terms of the incident and the scattered fields, we 
have 


G(E, E) = G(E a , E*) + 2G r (E*,E* nc ) + G(E inc , E’ nc ) (4.27) 


The first and the third terms on the RHS of (4.27) cannot be simplified any further. 
The second term will, however, lend itself to more simplification. Making use of a 
simple vector identity and the divergence theorem, we can rewrite G(E*,E mc ) as 


G(E S , E' nc ) = [ E a Vx- VxE* nc - ifc 2 e r E mc 

JV [ H T 

- j s E* • (n x VxE ,nc ) dS 


dV 


(4.28) 


/ k U-(VxE')(VxE'“) 

/ E 

JV 


dV = 


Vx— VxE ,nc 

(*r 


dV - Js J-E* ■ (n x V xE ,nc ) dS (4.29) 


since 
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and the surface integral cancels out everywhere inside the computational domain 
except on the mesh termination boundary S 0 ■ If we define V d to be the volume 
occupied by dielectric materials, then the remaining volume (V 0 = V - V d ) is the 
volume occupied by free space. On incorporating this into (4.28), we have 


G(E*,E‘ nc ) = J E* • [VxVxE mc - fc„E ,nc ] dV 

+ / E* 

Jv d 

- f E* • (n x V xE’ nc ) dS 

J So 


V x —V xE' nc — A:*e r E' 

Mr 


dV 


(4.30) 


Since the incident electric field satisfies the wave equation in free space, the first 
term of (4.30) is identically zero. The third term cancels exactly with the cross term 
r E • U‘ nc dS in the total field functional (4.19). The second term can be simplified 

J 

by employing a standard vector identity and the divergence theorem to yield 


dV = 


[ E* • V x — V x E‘ nc - kltrE* 

JV d [ Mr 

f — (VxE*) • (VxE’ nc ) - k 2 0 e r E* • E‘ 

Jv d Mr 

+jk 0 Z 0 / — E* • (n x H‘ nc ) dS 
Js d Mr 


dV 


(4.31) 


where the normal to S d is directed away from V d . The surface integral over the 
dielectric interface S d occurs since the tangential component of the scattered elec- 
tric field is discontinuous over the interface between two dielectrics having dissimilar 
permeabilities. It should be noted that (4.31) holds good even when there are mul- 
tiple dielectric regions present. If the dielectric regions have the same permeability 
(fi r _ = . . . fi rn = 1, for example) and different permittivities, the surface inte- 

gral contribution over the dielectric interfaces -S di , . . . , S dn - is zero. If different per- 
meability values are also present, then the permeability values must be substituted 
into the element equations and the direction of the normal for the two elements on 
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the interface should take care of the respective signs. Therefore, G(E 4 , E‘ nc ) reduces 
to 


/ — (VxE 4 ) • (V xE ,nc ) - Jfc*e r E 4 • E ,nc dV 

JV d Hr 

+jk 0 Z 0 f — E* • (n X H inc ) dS 

JS d Hr 

- f E 4 • (n x VxE ,nc ) dS 

J Sq 


The impedance and resistive sheet boundary conditions can be incorporated in a 
similar way into the scattered field functional. After simplification, the functional 
F( E 4 ) for the scattered field is then given by 


F(E 4 ) = / —(V x E 4 ) • (V x E 4 ) - fc’e r E 4 • E 4 dV 

JV [fir 

+jk o Z 0 [ (n x E 4 ) • (n x E 4 ) dS 
Js k K 

+ f E 4 • P{E')dS 
Js 0 

+2jk 0 Z 0 [ — E 4 ■ (n x H‘ nc ) dS 
Js d Hr v ' 

+2 / I— (V x E 4 ) • (V x E‘ nc ) - fc^c r E 4 • E‘ nc dV 

JVd [Hr 

+2 jk D Z 0 [ hn x E 4 ) • (n x E inc )dS 
Js k K 

+f (E mc ) (4.33) 


where Vj, is the volume occupied by the dielectric (portion of V where t T or /x r are 
not unity), Sd encompasses all dielectric interface surfaces and 

f E 4 • P(E a )dS = [ [a (E 4 ) 2 + /?(V x E 4 )^ — ■ E 4 ) 2 ] dS 

J So * So 

when the second order ABC is employed. The function / (E‘ nc ) is solely in terms 
of the incident electric field and vanishes when we take the first variation of F(E 4 ). 
We remark that the scattered field formulation was implemented in our code. 
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4.5 Finite element discretization 

To discretize the functional given in (4.33), the volume V is subdivided into a 
number of small tetrahedra, each occupying the volume V e (e = 1, 2, • • • , M), where 
M denotes the total number of tetrahedral elements. Within each element, the 
scattered electric field is expressed as 

E' = £ £;w; = {W*) T {£'} = {£*} T {w*} (4.34) 

j=l 

where W* are the edge-based vector basis functions [25], E] denote the expansion 
coefficients of the basis and represent the field components tangential to the ;th 
edge of the eth element, m is the number of edges making up the element and 
the superscript stands for the element number. The basis functions used in our 
implementation have zero divergence and constant curl. 

The system of equations to be solved for E? is obtained by a Rayleigh-Ritz 
procedure which amounts to differentiating F with respect to each edge field and 
then setting it to zero. On substituting (4.34) into (4.33), taking the first variation 
in F and assembling all M elements, we obtain the following augmented system of 
equations 

f or? 'I M M. M P 

+ £< c ’') = 0 < 435) 
oE e ) e=1 ,=1 P =1 

In this, M a denotes the number of triangular surface elements on S* and S a and M p 
is equal to the sum of the surface elements on S k , S d and the volume elements in V d . 
The elements of the matrices [A 8 ], [ B •] and {C p } are given by 

A'.. = I [— ( V x W') ■ (V x W") - kl€ r W' ■ W' iV 

,J JV • L/i r J 

£*, = A zJ/ l(nx w -)-(nxW - )<i5 
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+ jja W’ t • W' t + 0(V X Wf)n • (V X W*) n - 0(V • WJ f )(V • W‘,)]dS 
Cf = 2jfc 0 z 0 [ / — W? • (n X H ,nc )<f5 + f ^(n x W?) • (n x E' nc )dS 

+2 / f— (V x W?) • (V x E* nc ) - Jb 0 2 c r Wf • E ,nc dV 
JV* [fir J 

where Vf is the volume of the pth tetrahedron inside the dielectric, S’ and S p repre- 
sent the surface area of the sth and pth triangular surface element and the subscripts 
t and n denote the tangential and normal components of a vector, respectively. The 
boundary condition n x E s = -n x E mc must be imposed a priori on metallic bound- 
aries; however, no special treatment is required at material discontinuities. Only the 
identification of the edges on material discontinuities or inhomogeneities is required 
to kick in the contribution from the surface integrals in (4.33). 

The biconjugate gradient algorithm with diagonal preconditioning was used to 
solve the sparse, symmetric system of equations. The residual norm was usually 
set to less than 0.1% of the solution norm as a criterion for convergence since lower 
tolerances did not appear to offer significant improvement on the far-field values. The 
data structure was constructed such that only the non-zero elements of the upper 
triangular part of the symmetric, sparse matrix were stored in & N a x k complex 
array. In our case, N a was typically 1.1 x N u , where JV U denotes the number of 
unknowns and k was equal to 12. The corresponding addresses were stored in a 
separate N a x k integer array. The storage required in this scheme was about 25 JV U 
and the number of distinct non-zero elements was typically 9 N u . 

4.6 Results 

A computer program was written for implementing the proposed FE-ABC formu- 
lation. This implementation was validated by computing the scattering for several 
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Figure 4.4: Bistatic echo-area of a perfectly conducting cube having edge length of 
0.755A. Plane wave incident from 6 = 180°; <f> = 90°. 

configurations including metallic and dielectric bodies as well as structures satisfy- 
ing resistive and impedance boundary conditions. Figure 4.4 compares the measured 
[52] bistatic cross-section ( 6 inc = 180°, <f> tnc = 90°) of a metallic cube having an edge 
length of 0.755A with the corresponding pattern computed by the three-dimensional 
FE-ABC code. The second-order vector ABC was employed on a spherical mesh 
truncation boundary which was placed only 0.1A from the edge of the cube. About 
33,000 unknowns were used for the discretization of the computational domain and 
the [A] matrix contained a total of 264,000 distinct non-zero entries. The storage 
requirement of this matrix was consequently much smaller than that of the 1400 
unknown moment method system (assuming the same sampling rate as the FEM of 
14 points/ A) which had 2 million non-zero entries. 

In Figure 4.5, we plot the normal incidence backscatter RCS of a perfectly con- 
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Figure 4.5: Backscatter RCS of a perfectly conducting cube at normal incidence as 
a function of edge length 

ducting cube as a function of its edge length. The meshes constructed for this 
experiment were terminated on conformal boundaries, i.e, on another cube placed a 
small distance (more than 0.15A) from the scatterer. As seen, the agreement with 
measured data [52] is remarkably good over a 50dB dynamic range. 

Figure 4.6 presents backscatter data for a cylinder of radius 0.3A and height 0.6A. 
The data from the three-dimensional finite element code again compare well with that 
obtained from a moment method-body of revolution code. The mesh was terminated 
on a spherical boundary at a distance of 0.3A from the edge of the scatterer and the 
system consisted of nearly 33,000 unknowns. Convergence was achieved within about 
350 iterations when the Sommerfeld radiation condition was imposed on the spherical 
mesh termination boundary. Each iteration took approximately 0.1 seconds on a 
Cray YMP after vectorization and on the average it was found that for N > 25, 000, 
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Figure 4.6: Backscatter pattern of a perfectly conducting cylinder having a radius of 
0.3A and a height of 0.6A. The solid and the dashed lines indicate data 
obtained from a body of revolution code and the black and white dots 
indicate FE-ABC data. 

the number of required iterations were approximately N/ 100. The agreement was 
quite good even on enclosing the metallic cylinder with a rectangular outer boundary 
placed 0.3A from the edge of the scatterer. 

The results presented till now have been for perfectly conducting geometries. 
However, the real advantage of the FEM over integral equation techniques is the ease 
with which the former can handle material inhomogeneities and transition conditions. 
With this in mind, the remaining figures show backscatter and bistatic patterns for 
scatterers comprised of resistive cards, dielectric material and combinations of these. 
The first geometry that we tested was that of a homogeneous dielectric sphere having 
a relative permittivity of 4 and a radius of 1/27T. The bistatic pattern of the geometry 
is compared to that obtained using a CG-FFT formulation (Figure 4.7) and is seen 
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Figure 4.7: Bistatic echo-area of a homogeneous dielectric sphere (e r = 4; k 0 a =1). 

to agree remarkably well. The finite element mesh was terminated only 0.3A from 
the dielectric body. 

Another of the test cases was a prolate spheroid shown in Figure 4.8 filled with 
lossy dielectric having e r = 4— jl, k 0 a = tt / 2 and a/b = 2, where a and b are the major 
and minor axes of the spheroid, respectively. The bistatic pattern ( 0 ,nc = 180°; <j>' nc = 
90°) obtained from the F E-ABC solution agree reasonably well with those obtained 
via the hybrid finite element-boundary integral method presented in [42]. However, 
the corresponding convergence rate for non-metallic bodies and resistive/impedance 
sheets was found to be slower than that observed for metallic scatterers. A diagonal 
preconditioner was, therefore, used to accelerate the convergence of the biconjugate 
gradient algorithm with encouraging results. 

For our last example, we compute the scattering from an inhomogeneous geom- 
etry with embedded resistive cards. Particularly, the scatterer shown in Figure 4.9 
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Figure 4.8: Normalized bistatic pattern of a lossy prolate spheroid (e r = 4 — \ k 0 a — 
n/2-,a/b = 2), where a and b are the major and minor axes of the 
spheroid, respectively 

consists of an air-filled resistive card block (0.5A x 0.5A x 0.25A) joined to a metallic 
block (0.5A x 0.5A x 0.25A). In Figure 4.10, we compare a principal plane backscatter 
pattern obtained from our 3D FE-ABC implementation with data computed using a 
traditional moment method code [53] for both polarizations. The computed data is 
again seen to follow the reference data closely. For the FE-ABC solution, the scat- 
terer was enclosed within a cubical outer boundary placed only 0.3A away from the 
scatterer. This resulted in a 30,000 unknown system which converged to the solution 
in about 400 iterations when using the Sommerfeld radiation condition and in 1600 
iterations when the second order ABC was used. For this geometry, the second order 
ABC did not provide a significant improvement in accuracy (only about O.ldB) over 
the first order condition. The same case was run with a higher discretization result- 
ing in a system of 50,000 unknowns; however, there was no significant difference in 
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Figure 4.9: Geometry of cube (a = b = 0.5A) consisting of a metallic section and a 
dielectric section (e r = 2 — j2), where the latter is bounded by a resistive 
surface having R = Z 0 . 

the far-field values with the earlier case. The geometry for the backscatter pattern 
shown in Figure 4.11 is the same as in Figure 4.9 with the air- filled section now 
occupied by a lossy dielectric having e T = 2 — j2. The backscatter echo-area pattern 
for the <f><f> polarization as computed by our FE-ABC code is again seen to be in good 
agreement with corresponding moment method data [53]. 

4.7 Conclusions 

In this chapter, we have shown that the finite element technique with vector ba- 
sis functions, when coupled with ABCs for mesh termination and the biconjugate 
gradient algorithm for the solution of the resulting system, is a viable procedure for 
computing the scattering by three-dimensional targets. We have found that these 
ABCs can be enforced only a small fraction of a wavelength from the scatterer’s 
surface. This is probably due to the fast (1/r) decay of the scattered fields. As 
a result, in addition to the sparsity of the matrix, the total number of unknowns 
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Figure 4.10: RCS pattern in the x - z plane for the composite cube shown in Figure 
4 9 The lower half of the cube is metallic while the upper half is air- 
filled with a resistive card draped over it. 

is kept under control. Further, due to the use of edge elements, the program can 
easily handle sharp conducting edges and tips, inhomogeneous dielectric and/or mag- 
netic materials, resistive sheets and impedance surfaces. These, in conjunction with 
the well-known advantages of the finite element method, result in low 0{N) stor- 
age requirement, making the computation of large body scattering possible. These 
capabilities along with the ease in modeling arbitrary geometries, makes this formu- 
lation, to the best of our knowledge, one of the first suitable for solving practical 
three-dimensional scattering problems. 




c/X\ in dB 
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Figure 4.11: RCS pattern in the x — z plane for the composite cube shown in Figure 
4.9. The composition of the cube is the same as in Figure 4.10, except 
that the air-filled portion is filled with dielectric. The solid curve is the 
FEMATS pattern and the black dots are MoM data for the £‘ nc = 0 
polarization. 




CHAPTER V 


Optimization and parallelization 


In the previous chapter, we laid the foundation for our methodology by outlining 
the formulation of the finite elment system together with the absorbing boundary 
condition method of mesh termination and presenting some examples to validate 
our solution technique. We found that the FE-ABC technique yielded accurate 
far-field values for small geometries, i.e., structures whose dimensions are less than 
a wavelength. However, our principal motivation was to compute scattering from 
large, three dimensional structures having arbitrary material inhomogeneities and 
regions satisfying impedance and/or transition conditions. As mentioned earlier, the 
number of unknowns escalates rapidly in three dimensions as the target size increases. 
Therefore, the limiting factor in dealing with three dimensional problems is the 
unknown count and the associated demands on storage and solution time. Solution 
techniques which have 0{N) storage and feasible solution times are thus the only 
way that the curse of dimensionality can be avoided. This is one of the principal 
reasons for the popularity of partial differential equation techniques over integral 
equation (IE) approaches which lead to dense matrices and 0(N 2 ) storage. As the 
problem size increases, the IE and hybrid methods, both of which need O(JV'), 1 < 
l < 2, storage, quickly become unmanageable in terms of storage and solution time. 


75 


Another concern while solving problems having more than 100,000 unknowns - a 
scenario that can be envisioned for most practical problems - is to avoid software 
bottlenecks. The algorithmic complexity of any part of the program should increase 
at most linearly with the number of unknowns. 

In this chapter, the implementation details of our finite element code are presented 
along with the associated numerical considerations. The various trade-offs associ- 
ated with the data structures used to represent sparse matrices and their impact on 
vectorization and parallelization are discussed. The iterative solver, a preconditioned 
biconjugate gradient (BCG) algorithm, is studied along with point and block precon- 
ditioning strategies and the trade-offs between the two types of preconditioners are 
outlined. A modified incomplete LU (ILU) preconditioner is presented, which seems 
to work better than the original ILU preconditioner for our matrix systems. Itera- 
tive solvers for unsymmetric matrix systems are also mentioned to handle anisotropic 
geometries and situations where the mesh termination condition makes the system 
unsymmetric. In order to facilitate the solution of large problems, the computation- 
ally intensive portions of the finite element code have been parallelized on a variety 
of massively parallel architectures like the KSR1 (Kendall Square Research) and the 
Intel iPSC/860. A full analysis of the communication patterns is also presented for 
the KSR1 machine. 

5.1 Numerical considerations 

The finite element code implemented by the authors can be divided into four 
main modules: 

• Input /output 

• Right-hand side vector (b) generation 
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• Finite element matrix (A) generation 

• Linear equation solver 

The input to the program consists of the mesh information obtained by pre-processing 
the mesh file generated from SDRC I-DEAS, a commercial CAD software package. 
The right-hand side vector, b, is usually a sparse vector and only a small fraction 
of the total CPU time is required to generate it. The finite element matrix genera- 
tion consists of too many subroutine calls and highly complex loops to permit any 
significant speedup through vectorization. It is, however, highly amenable to paral- 
lelization as will be discussed later. The most time-consuming portion of the code 
is the linear equation solver, taking up approximately 90% of the CPU time. On 
a vector computer like the Cray YMP, it is possible to vectorize only the equation 
solver. However, short vector lengths and indirect addressing inhibit large vector 

speedups. 

5.2 Matrix storage and generation 

The matrix systems arising from I-DEAS were very sparse: on the average, the 
minimum number of non-zero elements per row was 9 and the maximum number of 
non-zeros per row was 30. The total number of non-zeros varied between 15 N and 

16N, where N is the number of unknowns. 

There are various storage schemes for sparse matrices. In this chapter, we will 
discuss the ITPACK format [54], the jagged diagonal format and the Compressed 
Sparse Row (CSR) format. Knowledge of the storage formats is important since the 
speed of computation on vector or parallel processors is directly linked to the data 



structure used for matrix storage. 
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In the ITPACK storage scheme, a sparse matrix A of order N is stored using two 
arrays T> and VC. For example, if we have the 5x5 unsymmetric matrix A 

3 0 0 4 5 

7 0 4 0 2 

A = 4 0 7 0 0 

0 0 8 0 0 

9 7 0 0 0 

Then, according to the ITPACK scheme, the rows of the array V will contain the 
non-zero elements of the corresponding rows of the original matrix. The number of 
columns of V will be equal to the maximum number of non-zeros in a row; rows 
containing fewer non-zero elements will be zero padded. The array V thus looks like 

3 4 5 

7 4 2 

V= 4 7 0 

8 0 0 

9 7 0 

The column indices of the elements in V are stored in an integer array VC defined 
as 

1 4 5 

1 3 5 

PC = 1 3 * 

3 * * 

1 2 * 

The asterisk denotes that the corresponding elements of V are zeros. The ITPACK 
storage scheme is attractive for generating finite element matrices since the number 
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of comparisons required while augmenting the matrix depends only on the locality of 
the corresponding edge and not on the number of unknowns. Moreover, the sparse 
matrix- vector multiplication process can be highly vectorized because of large vector 
lengths when the number of non-zeros in all rows is nearly equal. However for our 
application, almost half the space is lost in storing zeros. As a result, a lot of 
storage as well as computational effort is wasted in storing and operating on zeros, 
respectively. 

The modified ITPACK scheme [55] does alleviate this problem to a certain de- 
gree by sorting the rows of the matrix by decreasing number of non-zero elements. 
However, 30% of the allotted space is still lost in zero padding. 

The best trade-off between storage and speed for our application on parallel 
architectures is obtained by storing the non-zero matrix elements in a long complex 
vector, the column indices in a long integer vector and the number of non-zeros per 
row in another integer vector. On a vector machine, the jagged diagonal storage 
scheme gives better results in terms of vectorizability. This scheme will be explained 
in the next section. 

The data structure used for storing the sparse matrix on MPP machines is referred 
to as the Compressed Sparse Row (CSR) format. A similar data structure which 
stores the row indices instead of the column indices is called the Compressed Sparse 
Column (CSC) format. The CSC format is sometimes used when the matrix is 
to be accessed along the rows and not the columns, e.g., in the multiplication of 
the transpose of a sparse matrix with a vector. In our implementation, a map of 
the number of non-zeros for each row is obtained through a simple pre- processor. 
The main program stores the matrix in CSR format, thus minimizing storage and 
sacrificing a bit of speed. The required storage is 151V to \6N complex words plus 
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integers for V and VC , respectively, and N integers for the array containing the 
pointers to the rows’ data. 

5.3 Linear equation solver 

In three dimensional applications, the order N of the system of linear equations 
may be very large. Direct solution methods usually suffer from fill-in to an extent 
that these large problems cannot be solved at a reasonable cost even on state-of- 
the-art parallel machines. It is, therefore, essential to employ solvers whose memory 
requirements are a small fraction of the storage demand of the coefficient matrix. 
This necessitates the use of iterative algorithms instead of direct solvers to preserve 
the sparsity pattern of the finite element matrix. Especially attractive are iterative 
methods that involve the coefficient matrices only in terms of matrix- vector products 
with A or A T . The most powerful iterative algorithm of this type is the conjugate 
gradient algorithm for solving positive definite linear systems [56]. In our implemen- 
tation, the system of linear equations is solved by a variation of the CG algorithm, 
the biconjugate gradient (BCG) method. This scheme is usually used for solving 
unsymmetric systems; however, it performs equally well when applied to symmetric 
systems of linear equations. For symmetric matrices, BCG differs from CG in the 
way the inner product of the vectors are taken. 

The conjugate gradient squared (CGS) algorithm [57] performs best when applied 
to unsymmetric systems of linear equations. It is usually faster than BCG but is more 
unstable since the residual polynomials are merely the squared BCG polynomials and 
hence exhibit even more erratic behavior than the BCG residuals. Moreover, there 
are cases where CGS diverges, while BCG still converges. Recently, Freund [58] has 
proposed the quasi-minimal residual (QMR) algorithm with look-ahead for complex 
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symmetric matrices. 

Based on the above, the biconjugate gradient (BCG) algorithm was found to be 
most suitable for our implementation. The BCG requires 1 matrix- vector multiplica- 
tion, 3 vector updates and 3 dot products per iteration. The solution scheme requires 
only three additional vectors of length N. The vector updates and the dot products 
can be carried out extremely fast on a vector Cray machine like the Cray YMP , 
reaching speeds of about 190 MFLOPS. However, the matrix-vector product, which 
involves indirect addressing and short vector lengths, runs at about 45.5 MFLOPS 
on 1 processor of the 8-processor Cray YMP. As a rule of thumb, the biconjugate 
gradient algorithm with no preconditioning consumes 4.06 microseconds per iteration 
per unknown on the Cray YMP. 

As mentioned earlier, there are two problems which limit the vectorizability of 
a sparse matrix code - short vector lengths and indirect addressing. There is not 
much one can do about the second problem since sparse matrices must have indirect 
addressing to exploit the O(N) storage feature. However, the first problem can be 
removed by storing the matrix in a different format such that the vector lengths are 
approximately equal to the order of the system being solved. The storage format is 
called the jagged diagonal format [59]. The rows are ordered by decreasing degree 
and the leftmost elements of each row are stored as a dense vector with an additional 
vector indicating the column numbers of each element. The matrix is thus stored 
as a collection of vectors of decreasing length. The inner loop of the matrix-vector 
multiplication routine traverses the entire length of a jagged diagonal, which can be of 
the order of the system being solved. This feature enhances vectorization massively. 
The storage requirement of the above format can be made to be the same as the 
previously mentioned CSR format through careful programming. The altered code 
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then runs at around 275 Mflops on a Cray C-90. The dot product reaches speeds 
of 550 Mflops and the vector updates execute at 600 Mflops. It must be mentioned 
that the CRAY C-90 is a substantially faster machine than the Cray YMP but the 
CSR formatted matrix-vector multiplication routine runs about 4 times slower on 
the C-90. Therefore, we can reliably state that the method of jagged diagonals is the 
best sparse matrix storage scheme in terms of computer storage and vectorizability. 
The still slower execution speeds of the matrix-vector multiply compared with the 
vector update is due to the indirect addressing in the inner loop which causes memory 
contention. 

5.4 Preconditioning 

The condition number of the system of equations usually increases with the num- 
ber of unknowns. It is then desirable to precondition the coefficient matrix such that 
the modified system is well-conditioned and converges in significantly fewer iterations 
than the original system. The equivalent preconditioned system is of the form 

C_1 ][A]{x} = [C- 1 ] {b} (5.1) 

The non-singular preconditioning matrix C must satisfy the following conditions: 

1. should be a good approximation to A. 

2. should be easy to compute. 

3. should be invertible in O(N) operations. 

The preconditioners that we discuss below are the diagonal and the ILU point 
and block preconditioners. Block preconditioners are usually preferable due to re- 
duced data movement between memory level hierarchies as well as decreased number 
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of iterations required for convergence. Block algorithms are also suited for high- 
performance computers with multiple processors since all scalar, vector and matrix 
operations can be performed with a high degree of parallelism. 

5.4.1 Diagonal preconditioner 

The simplest preconditioner that was used in our implementation was the point 
diagonal preconditioner. The preconditioning matrix C is a diagonal matrix which 
is easy to invert and has a storage requirement of N complex words, where N is the 
number of unknowns. The entries of C are given by 

Cij = SiiAii, i = 1, . . . , JV; j = (5.2) 

where 6^ is the Kronecker delta. The matrix C -1 contains the reciprocal of the 
diagonal elements of A. The algorithm with the diagonal preconditioner converged 
in about 35% of the number of iterations required for the unpreconditioned case. This 
suggested that our finite elment matrix was diagonally dominant since the reduction 
in the number of iterations was rather impressive. The diagonal preconditioner is 
also easily vectorizable and consumes 4.1 microseconds per iteration per unknown 
on the Cray YMP, a marginal slowdown over the unpreconditioned system. 

A more general diagonal preconditioner is the block diagonal preconditioner. The 
point diagonal preconditioner is a block diagonal preconditioner with block size 1. 
The block diagonal preconditioning matrix consists of m x m symmetric blocks as 
shown in Figure 5.1. The inverse of the whole matrix is simply the inverse of each 
individual block put together. If the preconditioning matrix C is broken up into n 
blocks of size m, the storage requirement for the preconditioner is at most m x N. 
However, this method suffers a bit from fill-in since the inverted m x m blocks are 
dense even though the original blocks may have been sparse. Due to this reason, 
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Figure 5.1: Structure of block preconditioning matrix 

large blocks cannot be created since the inverted blocks would lead to full matrices 
and take a significant fraction of the total CPU time for inversion. However, since 
the structure of the preconditioning matrix is known a priori, this preconditioner 
vectorizes well and runs at 194 MFLOPS on the Cray-YMP for a block size of 8. For 
a test case of 20, 033 unknowns, a block size of 2 caused the maximum reduction in 
the number of iterations(14%) and ran at 197 MFLOPS. 

5.4.2 Modified ILU preconditioner 

The next step was to use a better preconditioner to improve the condition number 
of the system resulting in faster convergence. The traditional ILU preconditioner 
[60] was employed with zero fill-in; however, the algorithm took a greater number of 
iterations than the diagonal preconditioner to converge to a specified tolerance. This 
was probably because the ILU preconditioned system may not have been positive 
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definite [61]. The preconditioned conjugate gradient method usually converges faster 
if the preconditioner is positive definite, although this is not a necessary condition. 
Higher values of fill-in were not attempted since the preconditioner already occupied 
storage space equal to that of the coefficient matrix. 

Algorithm 1 : Modified ILU preconditioner with zero fill-in 

It is assumed that the data is stored in CSR format and that the column numbers 
for each row are sorted in increasing order. The sparse matrix is stored in the vector 
V and the column numbers in VC. SlQ{i) contains the total number of non-zeros 
till the ith row. The locations of the diagonal entries for each row are stored in the 
vector VI AQ. The preconditioner is stored in another complex vector, CU. 

for i=l step 1 until n-1 do 
begin 

lbeg=diag(i) 

lend*sig(i) 

for j=lbeg+l step 1 until lend do 
begin 

jj=P c Cj) 

ij*srch(j j ,i) 
if (ij.ne.O) then 
begin 

lu(ij)“lu(ij)/lu(lbeg) 

end 

end 




<7 


end 
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A modified version of the ILU preconditioner was next employed by eliminat- 
ing the inner loop of the traditional version. The algorithm basically scales the 
off-diagonal elements in the lower triangular portion of the matrix by the column di- 
agonal. Since the matrix is symmetric, it retains the LDL T form and is also positive 
definite if the coefficient matrix is positive definite. This preconditioner is less ex- 
pensive to generate and converges in about 1/3 the number of iterations taken by the 
point diagonal preconditioner. It has been tested with reliable results for N < 50000. 
However, the time taken by the two preconditioning strategies is approximately the 
same since each iteration of the ILU preconditioned system is about three times more 
expensive. The forward and backward substitutions carried out at each iteration runs 
at 26.5 MFLOPS on the Cray YMP and proves to be the bottleneck since they are 
inherently sequential processes and the vector lengths are approximately half that 
of the sparse matrix-vector multiplication process. The triangular solver is also ex- 
tremely difficult to parallelize. Techniques like level scheduling and self scheduling 
try to exploit the fine grain parallelism in the sparse system [62]. 

We implemented the level scheduling algorithm to examine the potential paral- 
lelism in the forward/backward substitution step. For solving any lower triangular 
system Lx = 6, the ith unknown in the forward solution is given by 

^ - E ^ j (5.3) 

If L is dense, all the components x i? . . . ,£.•_! need to be computed before x, can 
be obtained. However, when L is sparse, most of the /,^*s are zero; hence, we may 
not need to compute all of the unknowns xi, . . . , x,-_i before solving for x,-. Level 
scheduling is based on this simple observation. The dependencies between the un- 
knowns can be modeled using a graph in which node i corresponds to the unknown 


C 1. 



86 


Xi and an edge from node j to node i indicates that Uj ^ 0 implying that the value 
of Xj is needed for solving x,. The operation shown in (5.3) can now be rewritten as 




£ hi*. 


(5.4) 


Thus Xi can be solved at the fcth step if all the components Xj in (5.4) have been 


computed in the earlier steps. 

In order to implement the level scheduling algorithm, we need to define the depth 
of a node and the level of the graph. The depth of a node is defined as the maximum 
distance from the root [59]. Therefore, we will place an imaginary root node with 
links to the nodes having no predecessors so that the depth of each node will be 
defined from the same point. The depth of each node can now be computed with 
one pass through the structure of the coefficient matrix L by 


depth(i ) 


r 

1 , 

1 + maij^idepth^) : Uj ^ 0}, 


if Uj = 0 for all j < i 
otherwise 


(5.5) 


The level of the graph can then be defined as the set of nodes with the same depth. 
The level scheduling algorithm can now be implemented without physically ordering 


the matrix, but solving the system in increasing order of node depth and distributing 


the nodes at each depth among the available processors. 


Algorithm 2 ; Forward elimination step with level scheduling 

The number of levels of the graph , nlev, can be easily determined from the depth 
information. Two other integer vectors are also required. ORDER(i) stores the 
ordering of the rows of L in terms of increasing node depth. LEVEL(i) stores the 
index to the start of each level in ORDER(i). 


do k=l, . . . ,nlev 


./ V 
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do j=ilevel(k) , . . . ,ilevel(k+l)-l (parallel loop) 
i=iorder(j) 
execute Equation 5.4 
enddo 
enddo 

However, our efforts at parallelizing the ILU preconditioned system with level schedul- 
ing did not lead to significant speedup mainly due to the enormous amount of mem- 
ory traffic that was generated. This observation was also noticed in [62], where the 
authors estimated that the parallel algorithm generated as much as 10 times more 
traffic than the sequential code. In order to look for an effective parallelizable ILU 
preconditioner, we next turned our attention to a block ILU preconditioner. 

In our scheme of implementing the block ILU preconditioner, we distribute one 
block to each processor in a multi-processor architecture, thus achieving load balanc- 
ing as well as minimizing fill-in. The modified ILU decomposition outlined earlier 
is then carried out on each of these individual blocks. Further, since the blocks are 
much larger than the block diagonal version, the preconditioner is a closer approxi- 
mation to the coefficient matrix. Moreover, the triangular solver is fully parallelized 
since each processor solves an independent system of equations through forward and 
backward substitution. In our test case of 20, 033 unknowns, the number of iterations 
was reduced by approximately half the number required by the diagonal precondi- 
tioner. Since the work done is less than twice that for the diagonal preconditioner, we 
achieved a marginal savings of CPU time. However, the number of iterations required 
for convergence is highly sensitive to block size as shown in Table 5.1 for N = 20, 033. 
Table 5.1 clearly shows that a larger block size (smaller number of blocks) does not 
guarantee faster convergence. Nevertheless, there is an approximately 50% decrease 
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in the number of iterations over the point diagonal preconditioner, regardless of block 
size. The optimum block size is dependent on the sparsity pattern of the matrix and 
can only be determined empirically. The savings in the number of iterations over the 


No. of blocks 

No. of iterations 

i 

127 j 

2 

176 

4 

185 

8 

172 

12 

162 

16 

174 

24 

223 

28 

177 


Table 5.1: No. of iterations vs no. of blocks for a block ILU preconditioned biconju- 
gate gradient solution method. 


point diagonal preconditioner for 28 blocks is given in Table 5.2 for a system having 
224,476 unknowns. From the table, it is clear that the block ILU preconditioner 


Angle of 
incidence 

no. of iterations 

Ratio 

(ii/i) 

point diagonal (I) 

block ILU (II) 

0 

2943 

2758 

.937 

10 

5985 

3834 

.641 

20 

5464 

3984 

.729 

30 

6048 

3651 

.604 

40 

5770 


.564 

50 

5107 


.728 

60 

6517 


.639 

70 

5076 



80 

5305 

3551 

.669 

90 

2898 

2832 

.977 


Table 5.2: No. of iterations required for convergence of a 224,476 unknown system 
using the point diagonal and block ILU preconditioning strategies. 


is very effective in reducing the iteration count; however, the CPU time required is 
about 10% less than that required by the point diagonal preconditioner for the best 


case. 
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5.5 Parallelization 

The different versions of the F E-ABC code were parallelized on two different types 
of massively parallel architectures - the KSR1 and the Intel iPSC/860. The KSRl is a 
parallel machine which implements a shared virtual memory, although the memory 
is physically distributed for the sake of scalability. The Intel iPSC/860, on the 
other hand, is a distributed memory, Multiple Instruction, Multiple Data (MIMD) 
system in which the nodes process information independently of one another and 
communicate by passing messages to each other. The conversion of sequential or 
vectorized code to parallel code involves two primary tasks: 

• parallelization of DO loops 

Parallelism is introduced by allowing each processor to execute a portion of the 
DO loop. 

• distribution of arrays among the processor set 

Since each processor only has a limited amount of memory, each array is divided 
into smaller units that reside on each node. This also allows array accesses from 
each processor to be serviced by different nodes, thus reducing contention for 
resources on any single node. 

On a cache-only memory machine such as the KSRl, only the first step is neces- 
sary since the hardware cache system automatically takes care of data distribution 
among the processors. This makes porting codes to the KSRl quite easy. However, 
the increased control of data distribution and communication on the iPSC/860 can 
translate into improved performance for some applications. We will first describe 
our port and performance figures for the KSRl and then detail our port on the Intel 

iPSC/860. 


1. KSR1 port 

The basic strategy for the parallelization of the code is described on the biconjugate 
gradient solver with diagonal preconditioning. The other versions use the same par- 
allelization scheme with slight modifications. We also comment on the parallelization 
of the matrix assembly phase. 



Complex 

Real 

Operation 

* 

+ 

* 

+ 

Matrix Multiply 



Anze 

4 nze — 2 N 

Vector Updates 



16JV 

UN 

Dot Products 


3JV 

12 N 

Y2.N 


Table 5.3: Floating point operations per iteration. 


The symmetric biconjugate gradient method iteratively refines an approximate 
solution of the given linear system until convergence. Figure 5.2 shows the method 
in terms of vector and matrix operations. For a system of equations containing N 
unknowns, all these vectors are of size N and the sparse matrix is of order N . The 
number of nonzero elements in the sparse matrix is denoted as nze. Table 5.3 shows 
the operation counts per iteration for each type of vector operation. In the FE-ABC 
code, each vector operation is implemented as a loop. The program is parallelized 
by tiling these loops. For P processors, the vectors are divided into P sections 
of N/P consecutive elements. Each processor is assigned the same section of each 
vector. This partitioning attempts to reduce communication while balancing load. 
To guarantee correctness, synchronization points are added after lines 2, 7 , and 9. 
Lines 2 and 7 require synchronization to guarantee that the dot products are com- 
puted correctly. Note that the dot products in lines 6 and 7 require only one syn- 
chronization. The line 9 synchronization guarantees that p is completely updated 
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before the matrix multiply for the next iteration begins. 

In the sparse matrix vector multiplication, each processor computes a block of the 
result vector by multiplying the corresponding block of rows of the sparse matrix with 
the operand vector. Since the operand vector is distributed among the processors, 
data communication is required. The communication pattern is determined by the 
sparsity structure of the matrix, which in our case is derived from an unstructured 
mesh. Therefore the communication pattern is unstructured and irregular. However 
since the sparse matrix is not modified during the iterative process, the communi- 
cation pattern is the same at each iteration. Vector updates and dot products are 
easily parallelized using the same block distribution as in the sparse matrix vector 
multiply. 

Although sparse computations are known to be hard to implement efficiently 
on distributed memory machine, mainly because of the unstructured and irregular 
communication pattern, the previous scheme was easily and efficiently implemented 
on the KSR1 MPP thanks to the global address space [63]. Table 5.4 shows the 
execution time of one iteration (in seconds) and the speedup for different numbers 
of processors and for two problem sizes. 

For both problems, the performance scales surprisingly well up to a large number 
of processors. For the 20,033-unknown problem, the speedup for the parallelized 
sparse solver varies from 1 to 38 as the number of processors is increased from 1 to 
56 (Figure 5.3). The overall performance of the solver on 28 processors is more than 
three times that of a single processor on the Cray-YMP. The large problem (224,476 
unknowns) exhibits superlinear speedup which can be attributed to a memory effect. 
As a matter of fact, the large data set does not entirely fit in the local cache of a 
single node in the KSR which results in a large number of page faults. However, as 
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Initialization: 
x given 


r = b—Ax ; p = r ; tmp = r ♦ r 
Repeat until (read < to/) 


q =Ap 

(1) 

a = tmp/(q • p) 

(2) 

* = * + ap 

(3) 

1 

II 

1 

Q 

(4) 

q = C~ x * r 

(5) 

resd = yj\r • r*| 

(6) 

/? = (r • q)/tmp 

(7) 

trap = ft x tmp 

(8) 

P = q + ftp 

(9) 


| Step 1 


Step 2 


| Step 3 


EndRepeat 


.4 is a sparse complex symmetric matrix. 

C is the preconditioning matrix. 
q,p, se,r are complex vectors. 
a, ft, tmp axe complex scalars; resd,tol are real scalars. 


Figure 5.2: Symmetric biconjugate gradient method with preconditioning. 
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Figure 5.3: Speedup curve for the linear equation solver on the KSRl 

the number of processors increases, the large data set is distributed over the different 
processors’ memories. 

The global matrix assembly is the second largest computation in terms of execu- 
tion time. The elemental matrices are computed for each element in the 3D mesh 
and assembled in a global sparse matrix. A natural way of parallelizing the global 
matrix assembly is to distribute the elements over the processors, have each pro- 
cessor compute the elemental matrix of the elements it owns and update the global 
sparse matrix. Since the global sparse matrix is shared by all processors, the update 
needs to be done atomically. On the KSRl this is done by using the hardware lock 
mechanism. 

The performance for the matrix assembly is given in Table 5.5 and also in Figure 


5.3. 
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Procs 

N=20,033 

N =224, 476 


Execution time 

Speedup 

Execution time 

Speedup 


(secs per iter) 


(secs per iter) 


r 

.515 

1 

10.8 

1 

8 

.071 

7.3 

1.4 

7.7 

16 

.040 

12.9 

.671 

16.1 

29 

.027 

19.1 

.304 

35.6 

60 fc 



.149 

76.2 


Table 5.4: Execution time and speedup for the iterative solver 


5.5.1 Analysis of Communication 

In the main loop (Figure 5.2), significant communication between processors takes 
place only during the sparse matrix vector multiply (line 1) and the vector update of 
p (line 9). The rest of the vector operations incur little or no communication at all. 

The distribution of the nonzero entries in the matrix affects the amount and nature 

“For 1, 8 and 16 processors, only the first 100 iterations were run. 

‘Code run on a 64 node KSR at Cornell University 


Procs 

Execution time 
in seconds 

Speedup 

1 

24.355 

1 1 

2 

13.376 

1.8 

4 

6.811 

3.6 

8 

3.744 

6.5 

16 

1.89 

12.9 

25 

1.625 

15.0 

28 

1.276 

19.1 


Table 5.5: Execution time and speedup for the matrix generation and assembly 
(20,033 unknowns) 

/ A 
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of communication. In this section, we present an analysis of the communication 
pattern incurred by the sparse matrix vector multiplication as derived from analysis 
of the sparsity structure of the matrix. 

Line 1. In the matrix- vector multiply, each processor computes an jV/P-sized 
subsection of the product q. The processor needs the elements of p that correspond to 
the nonzero elements found in the N/P rows of A that are aligned with its subsection. 
Because the matrix A remains constant throughout the program, the set of elements 
of p that a given processor needs is the same for all iterations in the loop. However, 
since p is updated at the end of each iteration, all copies of its element set are 
invalidated in each processor’s local cache except for the ones that the processor 
itself updates. As a result, in each iteration, processors must obtain updated copies 
of the required elements of p that they do not own. 

These elements can be updated by a read miss to the corresponding subpage, by 
an automatic update, or by an explicit prefetch or poststore instruction. Figure 5.4 
lists the number of subpages that each of the 28 processors needs to acquire from other 
processors. Automatic update of an invalid copy of a subpage becomes more likely 
as the number of processors sharing this subpage grows. The number of processors 
that need a given subpage (excluding the processor that updates the subpage) is 
referred to as the degree of sharing of that subpage. Figure 5.5 shows the degree of 
sharing histogram for the example problem. Since the only subpage misses occurring 
in Step 1 of the sparse solver are coherence misses due to the vector p, the use of the 
poststore instruction to broadcast the updated sections of the vector p from Step 3 
should eliminate the subpage misses in Step 1. However, the overhead of executing 
the poststore instruction in Step 3 offsets the reduction in execution time of Step 1 . 
On a poststore, the processor typically stalls for 32 cycles while the local cache is 
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Number of Subpages 



Thread ID 


Figure 5.4: Counts of p subpages required by each processor for sparse matrix- vector 
multiply (total copies=5968) 

busy for 48 cycles. As a result, the net reduction in execution time is only 3%. 

Line 9. Before proceeding with the updates of the N/P elements of p for which 
it is responsible, each processor must acquire exclusive ownership for those elements. 
Because a cache line holds 8 consecutive elements, each processor will generate N /8 P 
requests for ownership (assuming all subpages are shared). In order to hide access 
latencies, the request for ownership can be issued in the form of a prefetch instruction 
after step 1. This could lead to an eightfold decrease in the number of subpage misses. 
However, as with the poststore instruction, the benefit of prefetching is offset by 
the overhead of processing the prefetch instructions in step 2. This is because the 
processor stalls for at least two cycles on a prefetch and the local cache cannot satisfy 
any processor request until the prefetch is put on the ring. The overall execution 
time is reduced by only 4% in this case. 
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Number of Subpages 



Degree of Sharing 

Figure 5.5: Degree of sharing histogram of p subpages during sparse matrix-vector 
multiply (28 procesors) 

Lines 2, 6, 7. The rest of the communication is due to the three dot products. 
Each processor computes the dot product for the vector subsection that it owns. 
These are then gathered and summed up on a single processor. 

2. Intel iPSC/860 port 

The parallelization of the DO loops is one of the main tasks since the majority of the 
computer time is spent on solving the linear system of equations. The basic strategy 
for parallelizing the DO loops on the iPSC/860 is similar to the KSRl - each portion 
executes a portion of the DO loop. This scheme works fine as long as there are no 
dependencies in the body of the loop, as is the case for the vector updates and the 
sparse matrix-vector multiply of the linear solver. However, the main loop in the 
matrix generation/ assembly phase contains a dependency between loop iterations. 
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As on the KSR1, this problem is solved by using a mechanism by which each processor 
locks a row of the matrix while performing an update. Since the locking of each row 
is maintained by the processor whose memory holds the particular row, processors 
lock and unlock rows by sending messages to the appropriate row owner. 

Even though the parallelization of loops enables programs to run faster on multi- 
processors, the distribution of arrays must be done for the code to run at all. Arrays 
are distributed in the code by partitioning one dimension among the processors. 
Thus for a 1000 element array, processor holds the first 100 elements, processor 2 the 
next 100 and so on. The straightforward method for accessing this distributed array 
involves the translation of array references into subroutine calls. Thus an expression 
x = a(i) is translated into the call call fetcha(i,x). The subroutine fetcha then 
sends a message to the processor that holds element a(i), which in turn sends a reply 
message with the value of <z(i). Although this scheme requires the implementation of 
a new subroutine for each distributed array and the replacement of each array access 
with a subroutine call, the process is easy and mechanical. However, such a scheme 
does not result in good performance. 

The primary reason for this is that the overhead for sending a message is much 
higher than that of sending a single byte. The cost for sending 10 or even 100 bytes 
is usually not much higher than that of sending 1 byte. Thus, messages need to be 
‘bundled’ for fast and efficient operation. However, the simple strategy mentioned 
above is in direct contrast to message bundling. Therefore, we have implemented 
the simple scheme for paxts of the code that do not take up a significant portion of 
computation time like the matrix generation/assembly phase and a better scheme 
for accessing the distributed arrays in the equation solver phase. 

The primary operation in the solver that generates communication is the sparse 
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matrix-vector product. Since the matrix-vector product involves performing a dot 
product of each row with the distributed vector, each processor must obtain the 
values for the entire vector from the other processors. The dot product operation 
must be carried out in several phases as each processor may not be able to hold the 
entire vector in memory. Thus each processor P begins the matrix-vector multiply 
by sending its portion of the vector to other processors, then performs the following 
tasks for every other processor P' 

• Reads the portion of the vector owned by P'. 

• Updates the partial dot product for each row by adding the product of the 
appropriate matrix element with the elements of the partial vector. 

After performing the above operations for all the processors, the dot product is 
complete. Unfortunately, each phase requires a pass over all the sparse matrix rows 
owned by the processor. In the future, it may be possible to sort each row of the 
matrix to allow the phases to pass over the rows in order. 

The speedup results on a problem with 31000 unknowns show that the problem 
scales reasonably well for a small number of processors. However, as the number of 
processors increases, much of the time is spent on communication and book-keeping 
than on true computation. Efforts are under way to run a larger problem. 



CHAPTER VI 


Conformal absorbing boundary conditions for the 

vector wave equation 


As mentioned in the previous chapters, the focus of this thesis is the computation 
of scattering from large three-dimensional structures. Since we are dealing with 
large targets having arbitrary shape, a spherical mesh termination boundary is not 
as attractive in terms of storage and computational cost. This is especially true 
for long and thin geometries where a sphere is the least economical shape of mesh 
termination, in terms of the number of unknowns. The ideal situation would be 
to enclose the scatterer inside a mesh termination boundary which is of the same 
shape as the scattering body (see Figure 6.1). If boundary conditions could be 
derived for such conformal mesh truncation surfaces, the volume to be meshed and 
the corresponding computing cost would then be minimized. However, available 
three dimensional ABCs for the vector wave equation as derived by Peterson [48] 
and Webb and Kanellopoulos [49] are only suited for application on spherical mesh 
terminations. 

Our goal, therefore, is to derive new vector ABCs for three dimensional analysis 
which can be applied on a surface conformal to the structure of interest. Wc begin 
with a modified Wilcox expansion whose leading order term recovers the geometrical 
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Figure 6.1: Scatterer enclosed in conformal mesh termination boundary 

optics fields and thus, given the appropriate principal curvatures, the resulting ABC 
completely absorbs all geometrical optics fields from the surface. We then proceed 
to derive the first and second order absorbing boundary conditions in terms of the 
principal curvatures of the surface on which they are employed. We also introduce an 
approximation to make the absorbing boundary condition contribution symmetric. 
In the next step, we incorporate these boundary conditions into the finite element 
equations and express them in a readily implementable form. We also comment 
on the symmetry of the system for doubly curved surfaces. In the last section, we 
examine the performance of these ABCs - in terms of computational cost - when 
applied on mesh termination surfaces conformal to the scattering object. 

6.1 Formulation 

It is known that the electric field in a homogeneous region of space is governed 
by the vector wave equation 

VxVxE - k 2 a E = 0 


( 6 . 1 ) 



where k a is the free-space wave number. We assume that the field has a well-defined 
phase front in the region under consideration. Since we are concerned only with local 
behavior, we can assume that the phase fronts can be treated as parallel regions. 
Consequently, the surface describing the phase fronts can be specified by a net of 
coordinate curves denoted by t\ and <2 and a third variable n denotes the coordinate 
along the normal to the phase front. The point of observation in the Dupin coordinate 
system [64] can now be defined as 

x = nn-f x 0 {t u t 2 ) (6.2) 

where n is the unit normal and x D (f i,< 2 ) denotes the surface of the reference phase 
front. The curl of a vector in the above coordinate system is given by 

3E 

VxE = Vr x E + n x — (6.3) 

on 

where Vj x E is called the surface curl involving only the tangential derivatives and 
is defined as [65] 

V T x E = -n x V£ n + t 2 KiE tl - 1 1 K 2 E t2 + nV • (E x n) (6.4) 


In (6.4), /ci and /c 2 denote the principal curvatures of the surface under consideration, 
Et , , E h are the tangential components and E n is the normal component of the electric 
field on the surface. The principal curvature of a surface is defined as [64] 


11 dh x 

Kl ~W l ~ hi dn 

1 1 dh,2 

K2 = lit 


(6.5) 

( 6 . 6 ) 


where h\, h 2 are the metric coefficients and ill, R 2 are the principal radii of curvature. 
Using the aforementioned coordinates, the Wilcox expansion for a vector radiating 
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function can now be generalized to read 


E(n,li,(2) - / n - R- 2 _ 


(6.7) 


Uy/R, w, 

where Ri = p, + n, i = 1,2 and p t is the principal radius of curvature associated 
with the outgoing wavefront at the target. The lowest-order term in (6.7) represents 
the geometrical optics spread factor for a doubly curved wavefront and reduces to 
the standard Wilcox expansion [66] for a spherical wave. Moreover, (6.7) can be 
differentiated term by term any number of times and the resulting series converges 
absolutely and uniformly [66]. 

6.1.1 Unsymmetric ABCs 


In the 3D finite element implementation using vector basis functions and the 
electric field as the working variable, we need to relate the tangential component of 
the magnetic field in terms of the electric field at any surface discontinuity. Therefore, 
our next task is to derive a relation between n x VxE (i.e., n x H where H is the 
magnetic field) and the tangential components of the electric field on the surface. 
Taking the curl of the electric field expansion given by (6.7) and crossing it with the 
normal vector, we have 


n x V xE = 


s —jk Q n oo 


47 r 


p= o 


(jk 0 4" Km ^ tEpn “h pAC m Epf 


u p+1 


UP+ 1 


(6.8) 


where u = \Afti #2 and 


V t £„ 




V 


— (n x n x V) E n 

+ k 2 
2 

A A A A 

/Cititi 4“ K2t2^>2 
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Considering that £on is zero due to the divergenceless condition [66] and simplifying, 

we obtain the first order absorbing boundary condition 

e ° n ^ tEpn 4” P^mEpi 

n x V xE — (jk Q + K m — r]-) E ( = -7— 1, ^ ^ 

p=l 

or, nx \?xE-(jk 0 + K m -Tj-)E t = 0 + O (n' 3 ) (6.10) 


for a conformal outer boundary. Not surprisingly, this is the impedance boundary 
condition for curved surfaces as derived by Rytov [67]. It should be noted that in the 
above equation, V t £„ and K m are each proportional to n -1 . Therefore, the leading 
order behavior of (6.10) is O (n~ 3 ), i.e., only the first two terms of (6.7) are exactly 
satisfied by (6.10). If the scattered field contains higher order terms, application of 
(6.10) will give rise to non-physical reflections back into the computational domain. 
In order to reduce these spurious reflections, we need to either shift the mesh trun- 
cation boundary farther away from the scatterer or employ higher order boundary 

conditions which satisfy higher order terms of (6.7). 

To reduce the order of the residual error further, we consider the tangential 

components of the curl of (6.9). This yields 


n x V x [n x V xE - (jk 0 + K m - V') E t] = 


e -jk 0 n f Kg 1 V fEpn “h p^m E pt 

£ [jfco + (P + 3)Km- — } „p+l 

p=l L v 

Kg \ VtEpn pK m Tj • E p t 


Air 

w p = 1 


where k 9 = /ci/c 2 is 
simplifying, (6.11) reduces to 


, ( 6 - 11 ) 

u p+i u p+1 

the Gaussian curvature. Using the result derived in (6.9) and 


- (2/c m - 

V K m / 


n x Vx [n x V xE - (jh + K m - q-) E t ] = 

p -jkoTl OO r r . Kg _ 1 V t E pn + pK m E pt ' 

D + (p + 3 ^ m _ ^ v '\ UP+ 1 

p=i 

- (2 « m V t E n 

\ K m ) 


47T 


( 6 . 12 ) 
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If we take a closer look at the term in the square brackets on the RHS of (6.12), we 
find that it can be written as 

V t E pn +p Km E pt 

~k pKm — ^ — 

+ (jk> + 3/c m - — - fj-) {n x VxE - ( jk 0 + n m - fj-) E,} 


where we have substituted 


-,—jkoTl CO 


47T 


^ t^pn "I" 


/P+1 


= n x VxE — (jk a + /c m — fj-) E t (6.13) 


p=l L 

using the relation derived in (6.9). 

Now the dominant terms on the RHS of (6.12) can be eliminated by considering the 
higher-order operator 


n x Vx - (jk Q + 3 K m - — -rj-) {n x VxE - ( jk a + n m - fj-) E<} + 


( K \ p -jk 0 n oo 


YtEpn 4- pn m E * 


4tt UP+ 1 


(6.14) 


The residual of (6.14) can be reduced further to yield the absorbing boundary con- 
dition of second order which satisfies (6.7) to O (n -5 ). This second order ABC is 
found to be 

n x V x - (jk a + 4/c m ~~~ »7-)] {n x VxE - (jk Q + K m - fj-) E t } + 

(2« m - — -fj-) V t E n = 0 (6.15) 

V K m / 


and the residual is equal to 

e -jkon oo 


y^(' D _1'W YlEz n it P K m'Ept 

4tt p y j m « p+i 


(6.16) 


The operator on the LHS of (6.15) can be applied repeatedly to obtain ABCs of 
increasing order; however, higher order basis functions are needed for their imple- 


mentation. 
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After some algebraic manipulation, the terms on the LHS of (6.15) reduced to 
simpler ones. In addition to the wave equation, the following vector identities were 
derived to carry out the simplifications and are provided below for the reader’s con- 
venience: 

nxVxE, = n x V xE — V<£„ 
n x VxV|£ n = V t (V-E < ) + 2« m V 1 E n 

nxVx(nxVxE) = Vx {n (VxE) n } - fc 2 E, - Ak {(VxE) ta t x + (VxE) tj t 2 } 

where Ak = - >c 2 . The derivation of these identities is given in Appendix B. Upon 

simplification, the second order ABC can be compactly written as 

-{D- 2/c m ) n x VxE + {4«^ -k 9 + D (jk a - fj-) + (fj) 2 • +« m A/<C-} E t 

+Vx {n(VxE) n } -I- (jK + 3/c m — - 2fj-\ V t E„ = 0 (6.17) 

in which 

D — 2 jk Q 5/c m 

and 

(rj) 2 • E t = + K 2 E t3 t 2 (6.18) 

£ = tltl “ £2^2 

The second order ABC derived in [48] is recovered on setting k x = k 2 = 1/r. 

6.1.2 Symmetric correction 

It has been shown by Peterson in [48] that the LHS of (6.17) when incorpo- 
rated into the finite element equations gives rise to an unsymmetric matrix system 
in spherical coordinates. To alleviate this problem, Kanellopoulos and Webb [49] 
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suggested an alternative derivation involving an arbitrary parameter which would 
lead to a symmetric matrix while sacrificing some accuracy. Below, we discuss a 
different approach which leads to a symmetric ABC without the introduction of an 
arbitrary parameter. 

On considering the series expansion of the term n x VxV t E n , we have 

f>~jk 0 n 00 rp 

nxVxV,£„ = -j r £(A + (,+ 1K.)^P 

= jkoV t E n + 2« m V t £J n + — l)« m — f ^ pn 

p=2 u p+1 

= j k a W tE n + 2/c m V < £ n + O (n 
and on making use of the vector identity 

V f (V • E t ) = n x VxV t £ n - 2k m V t E n 
given earlier, we arrive at the following result 

Vt(V-Ei) = jk 0 V t E n + O (n -5 ) (6.19) 

Since our ABC was derived to have a residual error of 0(n -5 ), we can replace 
jk 0 V t E n with V t (V-E<) without affecting the order of the approximation. Doing so, 
the second order ABC with a symmetric operator can be rewritten as 

(D - 2« m ) n x V xE = { 4 ^ - k 3 + D (jk 0 - rj-) + ( fj ) 2 • A«C-} E,+ 

Vx {n (V xE) n } + jj- (jk 0 + 3 Km -fL- 2 ff.^j v t (V-E t ) (6.20) 

It can be easily shown that the above boundary condition leads to a symmetric 
system of equations when incorporated into the finite element functional for surfaces 
having = « 2 . Equations (6.10) and (6.20) reduce to the boundary conditions 
derived in [49] on setting = /c 2 = 1/r which have been found to work well for 
spherical and flat boundaries [68]. 
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6.1.3 Finite element implementation 

The boundary condition outlined in equation (6.20) cannot be incorporated into 
the finite element equations without modification. As explained in Chapter 3, the 
absorbing boundary condition is implemented in the finite element system through 
the surface integral over the mesh termination surface 5 0 . 

I E • n x V xE dS = ^E-P(E)dS 

where P( E) denotes the boundary condition relating the tangential magnetic field 

to the tangential electric field on the surface. 

Let Pi(E) denote the first order absorbing boundary condition given by (6.10), 
where the subscript represents the order of the ABC. Therefore, the surface integral 
contribution for the first order ABC reduces to 

f E • Pi(E) dS = {jk a + K m )J s E-EtdS - J s E-(rj‘E t ) dS (6.21) 

Using some basic vector identities and considering that E t = -n x n x E, we deduce 
that 

[ E • Pi(E) dS = (jk a + n m ) J s (El + El) dS - J s ° (kiEI + k 2 EI) dS 

J So ° 

( 6 . 22 ) 

which is a readily implementable form of the first order ABC. However, the second 
order ABC does not simplify as easily. If P 2 (E) denotes the second order ABC given 
by (6.20), we can rewrite it in a more compact vector notation as 

P 2 (E) = a • E* + /3 • [V x (n (V xE) n }] + 7 • {V* (V-E t )} (6.23) 

where the tensors a , /3 and 7 are given by 

a = — — ^ — K g + D ( jk 0 — /ci) + «? + titi 
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+ 


|4/c^ - Kg + D ( jko - k 2 ) + «2 - KmA*} t 2 t 2 (6.24) 

(6.25) 


D - 2« m 


fi = y: — z — {titi + t 2 t 2 | — /? |titi + t 2 t 2 j 

JJ ~ lK m 

7 = — / 1 -> \ ■*" ~ ~^T 


;/:<,(£) - 2 « m ) 
-1 


— ^ — r + 3« m — 2 k 2 ^ t 2 t 2 

^K m ) V K m 


/ 


(6.26) 


jk 0 (D - 2/c m ) 

Substituting the second-order absorbing boundary condition in the surface inte- 
gral given in (6.23), we have 

/ E-P 2 (E)dS = f E-(a-E t ) dS + f E • (/3 • {fi (VxE) n }) dS 

J So * So J So 

+ [ E • {7 • V t (V-E t )} dS 

J So 

— 1\ + I 2 + 7 3 


Let us examine the integral I\. Since E < = — n x n x E, we have 

h = [ ai El+a 2 EldS (6.27) 

•J So 

after employing some simple vector identities. 

The other two integrals (/ 2 and 7 3 ) do not reduce as easily to simple, imple- 
mentable forms. They are first simplified using basic vector and tensor identities 
and then the divergence theorem is employed to eliminate one of the terms. Consid- 
ering the integrand of the second integral / 2 , we note that 

E • [/3 • (V xn</>)] = (/? • E) • (Vxn<£) 

where we have set <f> = (VxE) n . Using some additional vector identities and letting 
f3 • E = F, we get 

F • Vxn<£ = V'(#xF)|#'VxF 
= V • (<f>n x F) + <j> (V xF) n 



110 


Using the results from [64], the first term in the above identity can be further sim- 
plified to read 

V • {<j>n x F) = V, • {<{>n xF) + ^ {<f>n • (n x F)} - J {<f > n • (n x F)} 

= V, • {<f>h x F) ( 6 - 28 ) 


where V, denotes the surface gradient operator and J = k i + k 2 - The integral / 2 
can now be written as 

j 2 = [ V, • (<f>n x F) dS + / <£(V xF) n dS 

* Js 0 • /5 ® 

We can now apply the surface divergence theorem to the first term on the RHS of 
the this expression to yield 

f V, ■ {<f>h x F) dS = I <f>m ■ (n x F) dl = 0 (6.29) 

JSo Jc 

since the surface S 0 is closed. We note that m = 1 x n and \ is the unit vector along 
the edge of the surface element and C denotes the contour of integration. On the 
basis of (6.29) and considering that /3 is a simple scalar, h reduces to 

I 2 = / £(VxE ) 2 n dS (630) 

J So 

We now turn our attention to simplifying h for implementation in the finite 
element equations. Considering the integrand of I3, we have 


E • {7 • V t (V-Et)} = (7 • E) • {V t (V-Et)} 
where = V • E t . Next, setting G = 7 • E, we obtain 

G-|v^-n^}=V-(^G)-^V G-gJ^ (6.31) 
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The first term in the above identity can be written as 

V*(0G) = V,-(rJ>G) + -?-{il>G n )-J{rPG n ) 
and since dG n /dn = V • G — V • G t + JG n , the LHS of (6.31) reduces to 

G jvv>-n|^j = V, • (V>G) - V>V • G, (6.32) 

We now replace the integrand of I 3 with the expression in (6.32) and use the diver- 
gence theorem to eliminate the first term of (6.32). Specifically, 

h = f V, • dS - f ipV-G t dS 

JSo j So 

= f m • (V>G) dS- j ■ G t dS = - j ipV-GtdS 
JC JSo Js 0 

where m has been defined earlier and the contour integral vanishes since the surface 
is closed. The integral / 3 can finally be rewritten as 

h = - f (V • E f ) (V • G t ) dS (6.33) 

J So 

Using (6.27), (6.30) and (6.33), the complete surface integral term incorporating 
the conformal second order ABC reduces to 

f E ■ P 2 (E) iS = I '(a, El +a 2 E?) dS + f /9(VxE)“ dS 

^ ° J S 0 x 7 Js o 

-/ fc (V-*,){V.(>7.EU« (6.34) 

It remains to be seen whether the integrals in (6.34) lead to a symmetric system 
when incorporated into the finite element equations. With this in mind, we will 
examine three simple shapes and check whether they preserve symmetry of the finite 
element system. It will then be possible to generalize our findings to a more general 
mesh truncation boundary. 
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Let us consider the case of a sphere of radius r. Since the two principal curvatures 
of the sphere are identical (m = k 2 = 1 /r), the first order boundary condition reduces 
to the simple Sommerfeld radiation condition 


/ E-Pi(E) dS = jk 0 I (E 2 e + El)dS (6.35) 

JSo JSo 

On a spherical boundary, the second order ABC also reduces to the comparatively 
simple form: 


J s E ' ^(E) dS - + 2 j r (VxE) n 2 jfc 0 + 2 / r ( V t) 


dS 

(6.36) 


The ABC given in (6.36) is identical to the boundary condition derived in [49] for a 
spherical mesh termination surface and leads to a symmetric system of equations. 

Next, we consider the case of a planar termination boundary in which case m = 
k 2 = 0. The first order ABC then reduces to the Sommerfeld radiation condition 
and the second order ABC for a planar boundary simplifies to 


/ E ■ A( E) dS = l [jU* + ^ (VxBfi - ^ (V • E.) 2 ] dS (6.37) 


Since the planar boundary is a special case of a spherical boundary, (6.37) again 


reduces to asymmetric system of equations. 

Now we examine the situation when the mesh termination boundary is cylindrical 
in shape and of radius p. The principal curvatures of the cylindrical surface are 
then ki = l/p and « 2 = 0. Since the principal curvatures are no longer identical, 
the tensors a and 7 do not reduce to simple scalars. The first order ABC for a 
cylindrical outer boundary is given by 


J $ E-Pi(E)dS = (A + J) /*(*•+?) *"-/*) 


- / -El dS (6.38) 


IP 
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and the second order ABC gives by 

/ E-P 2 (E)dS = / (a cl El + a c2 El) dS+ f &(VxE ) 2 p dS 

j So ** So JSo 

-I '(V-E,){V •(%•£),} is (6.39) 

j So 

where a c i, ct c2 , /? c and 7 C are obtained by substituting «i = 1/p and k 2 = 0 in the 
original expressions for a, /3 and 7 . It is seen that the first order ABC given by 
(6.38) leads to a symmetric matrix for a cylindrical boundary. On the other hand, 
the second order ABC does not yield a symmetric matrix for an arbitrary choice 
of basis functions. However, the boundary condition outlined in (6.39) preserves 
symmetry on using linear edge-based elements for discretization. 

The above discussion enables us to conclude that the first order boundary condi- 
tion leads to a symmetric system for surfaces having arbitrary principal curvatures. 
However, symmetry is guaranteed for the second order ABC only when the two prin- 
cipal curvatures of the mesh termination boundary are identical, i.e., only when the 
outer boundary is limited to a planar or a spherical surface. Thus if we want to 
enclose a scatterer having arbitrary shape within a conformal outer boundary, an 
unsymmetric system of equations will have to be solved. It should, however, be 
noted that the resulting unsymmetric system will, in general, have a lesser number 
of unknowns than its symmetric counterpart. 

6.2 Applications 

In the previous section, we have discussed the derivation of absorbing boundary 
conditions which can be employed on surfaces conformal to the scattering or radiating 
structure. As a result, the mesh termination boundary can be made to enclose the 
scattering object more snugly. Consequently for arbitrary targets, we achieve a 



substantial saving in the amount of volume to be meshed between the ABC surface 
and that of the scattered This is particularly critical when the target is cylindrical 
in shape or a combination of cylindrical, doubly curved and planar surfaces as is the 
case with any real-life structure. 

In this section, we examine the performance of these boundary conditions when 
applied on conformal mesh termination surfaces. 

A. Composite cube 

For our first example, we compute the backscatter pattern of the half metal-half 


z 



Figure 6.2: Geometry of cube (a = 6 = 0.5A) consisting of a metallic section and a 
dielectric section (e r = 2 - j2), where the latter is bounded by a resistive 
surface having R = Z a . 

dielectric cube geometry shown in Chapter 3. However, instead of using a spherical 
surface to terminate the mesh, we employ the absorbing boundary condition on a 
piecewise planar surface, i.e., a cubical box placed only 0.3A from the face of the 
scatterer. The geometry is shown in Figure 6.2 and needed only 30,000 unknowns 
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for discretization. This is in stark contrast to the 40,000 unknown system which 
resulted when the geometry was enclosed in a spherical termination boundary. The 
decrease in the unknown count is even more dramatic as we go to larger scatterers. 
In Figure 6.3, we plot the backscatter pattern in the x — z plane (E' z nc = 0 polar- 



Figure 6.3: RCS pattern in the x — z plane for the composite cube shown earlier. The 
solid curve is the FEMATS pattern and the black dots are MoM data for 
the E' z nc — 0 polarization. Mesh termination is piecewise planar. 

ization) for the metal-dielectric cube geometry given in Figure 6.2 and compare the 
computed values with data obtained from a traditional method of moments (MoM) 
code [53]. The dielectric-filled section has unit permeability and a relative permit- 
tivity of 2 - j 2. The agreement with reference data is seen to be excellent; it can 
therefore be concluded that accuracy of the fax-field values has not been affected by 
a different mesh termination scheme. In fact, we have obtained results of comparable 
accuracy with only 75% of the computing resources than were necessary before. This 
observation will be made by the reader again and again in the following pages as the 
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full capability of these conformal ABCs is demonstrated. 

B. Inlets 

In our next example, we compute the scattering from perfectly conducting inlets. The 
aperture of an inlet usually has a large radar cross-section around normal incidence: 
therefore, a good understanding of its scattering characteristics is critical if measures 
need to be taken for reducing its echo-area. An accurate computer simulation of 
such a geometry provides a cost-effective and ready way of allowing the designer 
to experiment with complex material fillings to achieve satisfactory results. All our 
validations are carried out for empty inlets due to lack of reference data for more 
complicated structures. 



Figure 6.4: Backscatter pattern of a metallic rectangular inlet (1A x 1A x 1.5 A) for 
HH polarization. Black dots indicate computed values and the solid line 
represents measured data [1]. Mesh termination surface is spherical. 

The geometry of interest is a perfectly conducting rectangular inlet, with dimen- 
sions 1A x 1A x 1.5A. For the plots shown in Figures 6.4 and 6.5, we have enclosed 

o 
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Figure 6.5: Backscatter pattern of a metallic rectangular inlet (1A x 1A x 1.5A) for 
VV polarization. Black dots indicate computed values and the solid line 
represents measured data [1]. Mesh termination surface is spherical. 


the target within a sphere of radius 1.35A, which is only about .35A from the farthest 
edge of the scatterer. This resulted in a system of 224,476 unknowns and converged 
in an average of 785 seconds per incidence angle on the 56 processor KSR1. The 
computed values from our code agrees very well with measured data for both HH 
and VV polarizations. As can be seen from the above discussion, we have obtained 
our solution using significant computing resources and time. 

Our next step is to use the conformal mesh termination scheme formulated in the 
previous section and utilized in Example A. Therefore, instead of using a spherical 
mesh truncation surface, we terminate the mesh with a rectangular box placed only 
0.35A away from the scatterer (see inset of Figure 6.6). The problem size reduces dra- 
matically to 145,000 unknowns, a 35% reduction over the spherical mesh termination 
scheme. The convergence time for each excitation vector is about 220 seconds, less 



than 4 minutes, when run on all 56 processors of the KSR1. The computed values 
are again compared with measured data for both polarizations in Figures 6.6 and 
6.7; the agreement is excellent, albeit a bit worse than the spherical case. However, 
this fact is overshadowed by the fact that we have reduced the problem size by more 
than a third and computing time by about a fourth. 



Figure 6.6: Backscatter pattern of a metallic rectangular inlet (1A x 1A x 1.5A) for 
HH polarization. Black dots indicate computed values and the solid 
line represents measured data [1]. Mesh termination surface is piecewise 
planar. 

We then considered the problem of scattering from a perfectly conducting cylin- 
drical inlet. Even though integral equation codes are more efficient for such bodies 
of revolution, our primary concern in this test was to examine the performance of 
the conformal absorbing boundary conditions that we derived earlier. The target is 
a perfectly conducting cylindrical inlet having a diameter of 1.25 A and a height of 
1.875A. We first used a rectangular outer boundary, placed .45A from the farthest 
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Figure 6.7: Backscatter pattern of a metallic rectangular inlet (1A x 1A x 1.5A) for 
VV polarization. Black dots indicate computed values and the solid 
line represents measured data [1]. Mesh termination surface is piecewise 
planar. 

edge of the target, to enclose the scatterer. The radar cross-section was then com- 
puted for a ^-polarized incident wave in the yz-plane and compared with measured 
data (Figure 6.8). The agreement was found to be quite good for all lobes except 
the third. We expect the results to improve on moving the outer boundary farther 
away. The backscatter echo-area computed for the same geometry by Shankar [69] 
using the finite difference-time domain method with the absorbing boundary placed 
a few wavelengths from the scattering structure, agrees with the computed results 
remarkably well for all incidence angles. 

Next, we used a truly conformal termination scheme by using a cylindrical sur- 
face for mesh truncation. It should be noted that this is the first instance of a 
non-spherical surface (i.e., a surface having different principal curvatures) being ap- 
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Figure 6.8: Backscatter pattern of a perfectly conducting cylindrical inlet (diame- 
ter 1.25A, height=1.875A) for HH polarization. The solid line indicates 
measured data [2] and the black dots indicate computed data. Mesh 
termination surface is a rectangular box 

plied to terminate a finite element mesh for solving open problems. The cylindrical 
outer boundary was placed about 0.45A from the target and RCS computations were 
carried out for a <f> polarized incident wave and compared with measured data [2] and 
with a body of revolution code [3] (Figure 6.9). As can be observed from Figures 6.8 
and 6.9, the far-field results for a cylindrical termination and a rectangular termina- 
tion do not differ significantly. However, the savings in computational cost is quite 
impressive. The cylindrical mesh termination has only 144,392 unknowns compared 
to the 191,788 unknowns for a rectangular truncation scheme. A spherical mesh 
termination would have swelled to about 265,000 unknowns, sampling density and 
outer boundary distance remaining the same. Thus we have reduced the problem size 
by about 45% and computation time by a similar, if not greater, amount by using 
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a conformal mesh termination scheme. The savings in computational resources is 
quite significant even when we compare the rectangular and cylindrical termination 
schemes - a 25% reduction in problem size and a similar decrease in computation 
time. In Figure 6.10, we plot the backscatter pattern for the same cylindrical inlet 



Figure 6.9: Backscatter pattern of a perfectly conducting cylindrical inlet (diame- 
ter= 1.25A, height=1.875A) for HH polarization. Black dots indicate 
computed values, the solid line represents measured data [2] and the 
dotted line is body of revolution data [3]. Mesh termination surface is a 
circular cylinder. 

in which the incident wave is 0 polarized. The agreement is seen to be decent for the 
entire range of incident angles. 

C. Lossy foam cylinder with embedded wires 

The next geometry to be considered was a lossy foam (e r = 1.05 — j0.2) cylinder 
having a radius of 1A and a height of 3.5A with 0.5A long perfectly conducting wires 
embedded in it. The wires are spaced 0.5A apart from each other and have a diameter 
of 0.01 A. This is an interesting problem for two reasons: first, the orientation of the 
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Observation Angle 0 a , deg. 

Figure 6.10: Backscatter pattern of a perfectly conducting cylindrical inlet (diam- 
eter 1.25A, height=1.875A) for W polarization. Black dots indicate 
computed values, the solid line is measured data [2] and the dotted 
line represents reference data from a body of revolution code [3]. Mesh 
termination surface is a circular cylinder. 

embedded wires makes it a difficult geometry for other methodologies to handle; 
second, the problem is very large since the cylinder has a volume of 11 A 3 and a 
surface area of 28.27A 2 . 

As a first case, we consider the wires to be aligned along the axis of the cylinder 
(the Z axis in this case) and compute the resulting backscatter pattern in the (Figure 
6.11). The cylindrical mesh termination boundary was placed 0.45A from the flat 
and curved surfaces of the foam cylinder. The resulting system of equations had 
437,064 unknowns but needed only an average of 3050 iterations to converge to the 
desired answer. Thus each angle of incidence took a little more than 12.5 minutes 
to compute on a 56-processor KSR1 massively parallel architecture. The impressive 
convergence rate is due to the low contrast and the loss in the dielectric medium as 
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Figure 6.11: Backscatter pattern of a lossy foam cylinder with three perfectly con- 
ducting wires embedded along the axis. The incident electric field is 
oriented parallel to the axis of the cylinder. Mesh termination surface 
is a circular cylinder. 

well as due to the presence of the metallic wires. 

In the second case, the middle wire is offset 0.25A in the negative Af-direction 
from the cylinder axis. The number of unknowns and the time taken for convergence 
is comparable to the earlier case. The effect of the offset wire on the backscatter 
pattern can be studied in Figure 6.12. There is a slight asymmetry in the side lobes 
of the resulting backscatter pattern but, as can be expected, the effect is very small. 
D. Perfectly conducting plate 

The motivation for testing the FEMATS code on the perfectly conducting plate was 
two-fold. It is usually very difficult to model the scattering from the edges of the 
plate even using integral equation methods. Therefore, we wanted to carry out some 
tests to see how the code would behave at edge-on incidence. Secondly, we wanted 
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Observation Angle 6 0 , deg. 


Figure 6.12: Backscatter pattern of a lossy foam cylinder with the middle wire offse 
0.25A from the cylinder axis. The incident electric field is oriented 
parallel to the axis of the cylinder. Mesh termination surface is a circular 
cylinder. 

to examine the performance of termination boundaries of esoteric shapes. The first 
choice was to enclose the plate in a rectangular box. The second choice was to use 
a box with half cylinders attached to the faces normal to the plane of the plate 
- the reasoning being that since the edge of the plate behaves like a line source 
and scatters cylindrical waves, a cylindrical mesh termination was most suitable for 
wave absorption. It should be noted that both mesh termination schemes require 
approximately the same number of unknowns; the superiority of one over the other 
was thus decided only on the basis of computed backscatter values. 

Our test case is a 3.5A x 2A perfectly conducting rectangular plate. In Figure 
6.13, we plot the backscatter pattern for the 66 polarization in the xz plane, i.e., over 
the long side of the plate. The computed values compare very well with reference 
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Figure 6.13: Backscatter pattern {<jm) of a 3.5A x 2A perfectly conducting plate in 
the xz plane. The white dots indicate box termination; the black dots 
represent a combined box-cylinder termination. 

data; however, the code does not pick up the sharp null at 0 o = 45° and the two mesh 
termination schemes perform as well, although a slight improvement is noticeable for 
the box-cylinder termination. 

Next, we plot the backscatter pattern of the same geometry for the <j><f> polarization 
over the long side of the plate in Figure 6.14. Again, the agreement with reference 
data is quite good. However, the backscatter echo-area at edge-on incidence is not 
calculated accurately. 

In the next figure, we compute the RCS of the conducting plate in the yz plane, 
i.e., over its short side, for the <f><j> polarization. The backscatter echo-area for edge-on 
incidence is picked up very well for a rectangular-cylindrical termination whereas a 
rectangular truncation scheme gives completely incorrect results. These two schemes 
have approximately the same storage requirement; in fact, the box-cylinder combi- 
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Observation Angle ©o- deg. 

Figure 6.14: Backscatter pattern (cr^) of a 3.5A x 2A perfectly conducting plate in 
the xz plane. The white dots indicate box termination; the black dots 
represent a combined box-cylinder termination. 

nation yields a slightly smaller system of equation. This example truly illustrates 
the power of a conformal truncation scheme composed of simple shapes: not only 
axe the results far more accurate but even the storage requirement is slightly less. 

In all the above simulations, the boundary was terminated at 0.35A from the flat 
face of the plate and 0.5A from the edges of the plate. In order to test the accuracy 
of the ABC method as a function of mesh termination distance, we computed the 
backscatter patterns from the edges of the plate with increasing mesh termination 
distance. Figure 6.16 shows that the backscatter values from the plate edges slowly 
take the shape of the reference data as the mesh truncation distance is increased. 

E. Glass plate 

In the final example, we present the results for one of the most challenging problems 
solved by the FEMATS method. The target is a simple rectangular glass slab 1.75A 
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Figure 6.15: Backscatter pattern (cr^) of a 3.5A x 2A perfectly conducting plate in 
the yz plane. The white dots indicate box termination; the black dots 
represent a combined box-cylinder termination. 

long, 1A wide and .125A thick. The relative permeability of glass is taken to be 
3 — j. 09. The backscatter pattern (tree) is sought for the 0 = 80° cut. This is an 
extremely difficult problem for any numerical method to handle since the incident 
field is almost edge-on to the dielectric slab, causing the scattered field to have strong 
higher order components which decay appreciably only at large distances from the 
scatterer. 

In our first attempt, we enclosed the glass plate in a flat box placed .45A from it. 
Though the computed results were accurate for some angles, they departed signifi- 
cantly from the reference data [69] for other incidences. The results did not show a 
significant improvement on shifting the outer boundary 0.5A away from the scatterer 
while maintaining its shape. The next step was to modify the shape of the outer 
boundary such that the flat box had half cylinders attached to the faces normal to 
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Figure 6.16: Backscatter pattern ( a of a 3.5A x 2A perfectly conducting plate in 
the yz plane. The numbers in the legend indicate mesh termination 
distance from the plate edges. 

the plane of the slab - the reasoning was the same as that for the perfectly con- 
ducting rectangular plates mentioned in an earlier section. Further, this termination 
scheme results in an outer boundary with no sharp edges. The free space between 
two cylinders with their axes perpendicular to each other is filled with a quarter 
sphere having the same radius as the cylinders (see Figure 6.17). As can be observed 
in Figure 6.18, the computed results agree quite closely with the reference data for 
most of the incident angles. The problem was solved with only 155,000 unknowns 
and needed an average of 2700 iterations to converge to the specified tolerance. The 
slow convergence is typical of geometries that are composed of dielectrics since the 
linear system of equations becomes indefinite due to non-unit values of the relative 
permittivity. 
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Figure 6.17: An eighth of a glass plate enclosed inside a smooth mesh termination 
boundary composed of flat planes and cylindrical and spherical sections. 

6.3 Conclusion 

In the previous section, we have used our FEMATS methodology to compute scat- 
tering patterns from large, three dimensional geometries having arbitrary shapes and 
complicated configurations and with material inhomogeneities. It has been shown 
that the solution technique does indeed live upto its promise of delivering accurate 
results with the expenditure of minimal computer resources. Very large problems 
can be solved in real time with a fraction of the resources that existing numerical 
electromagnetics codes require. 

There are basically two parameters that govern the accuracy of the final result. 
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Figure 6.18: Backscatter pattern {(Tee) of a glass plate (1.75A x 1A x .125A) having 
a relative permittivity of (3 — j.09) for the 6 = 80° cut. 

The first one is the mesh termination technique. Though lower order absorbing 
boundary conditions are simpler to implement, a significant penalty in computational 
resources has to be paid for getting the same accuracy as the higher order ABCs. The 
reliability and the order of the ABC is thus crucial to the computation process. The 
second parameter is the shape of the mesh termination boundary and its distance 
from the scatterer. As shown in the previous sections, significantly better results 
can be obtained by using smooth termination boundaries in place of flat boxes with 
sharp corners. The distance of the termination boundary from the scattering body is 
also important in obtaining accurate results. We have determined that a distance of 
.45A usually gives reliable results for large three dimensional geometries. For small 
problems, distances of .3A - .35A are sufficient to give accurate far-field values. In 
order to obtain reliable near-field values such as the input impedance of an antenna, 
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the mesh must be terminated farther away from the target. 



CHAPTER VII 


Conclusions 


This thesis is one of the first works known to the author to tackle the problem 
of three dimensional scattering using finite elements and ABCs. It started out a a an 
investigation into the methodology mentioned above, given its extremely attractive 
features for solving large problems. The O(N) storage requirement and the iterative 
equation solver are essentially the keys to this technique, since they allow the solution 
of extremely large, realistic problems with minimal computational overhead. As the 
technology progresses to higher frequencies and longer electrical dimensions, the 
superiority of this solution methodology over integral equation or hybrid techniques 
will become even more apparent. 

The opening chapter outlined the motivation in choosing this solution technique 
over other traditional methods. In Chapter 2, we described the basic concepts of 
electromagnetics and finite elements. They were by no means complete or exhaustive 
and the interested reader is referred to several excellent texts for reference [10, 13, 14]. 
Chapter 3 provided the rationale for employing edge basis functions for discretizing 
electromagnetic field variables in three dimensional computation. A full review of 
nodal and edge bases of various orders for two and three dimensions was also carried 
out. In Chapter 4, the basic formulation of the solution technique was presented along 
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with code validation for a large number of small and composite geometries. Since 
the methodology was found to perform extremely well for small complex geometries 
with inhomogeneities, we considered extending it to compute scattering from very 
large problems. In order to achieve this goal, the first step was to make the finite 
element code computationally efficient. Chapter 5 discussed the various optimization 
techniques that were carried out to make the code as computationally efficient as 
possible on a wide class of vector and parallel machines. Since the mesh termination 
condition used till now were valid only for spherical or flat terminations, the next step 
was to derive more efficient boundary conditions that would yield accurate results 
even with reduced computational resources. In Chapter 6, new ABCs were derived 
which are enforceable on mesh termination boundaries conformal to the surface of the 
target and result in drastic reductions in the number of unknowns and hence solution 
time. Thus, it can be stated that this thesis has achieved its objective of showing 
the efficacy and the accuracy of computing three dimensional scattering from large, 
composite and complex-shaped structures using finite elements in conjunction with 
the ABC method of mesh termination. 

Any research, however, by its very nature generates more questions than it an- 
swers. The work outlined in the previous chapters is no exception. The author has 
sought to provide answers to the more immediate questions but many more need to 
be examined for making this methodology robust and viable in commercial applica- 
tions. Tests on a large variety of complicated problems have produced encouraging 
results and exhibited the versatility of this method. Future research into this method 
must inevitably focus on the following aspects: 

• iterative refinement of the solution through a posteriori error estimation and the 
use of hierarchical basis functions. Hierarchical basis functions use p refinement 
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instead of h refinement to adaptively correct the solution, maintaining the 
coarse mesh throughout. 

• further reduction in the number of unknowns through the use of 

1. mixed node-based and edge-based elements. Since node-based elements 
usually have half as many unknowns as edge-based elements, nodal basis 
can be employed in free-space regions to reduce unknown count. 

2. derivation of higher order boundary conditions. This would allow the 
mesh termination boundary to be placed even closer to the target, thus 
reducing the number of unknowns dramatically - the possible asymmetry 
of the resulting equation system is an acceptable tradeoff. 

• more robust, geometry- dependent, iteratively refined numerical mesh termi- 
nation conditions that would guarantee the accuracy of the final solution. In 
most cases, it is the mesh termination condition and its distance from the 
scatterer which determine the accuracy of the computation. Numerical, itera- 
tively refined ABCs could theoretically be applied very close to the scattering 
or radiating target with excellent accuracy. 

• extension of the formulation to antenna radiation and electromagnetic interfer- 
ence(EMI) problems. The formulation for the antenna problem must include 
efficient source modeling and reliable simulation of the near-field for impedance 
calculations. EMI phenomena critical to electronic packaging can be predicted 
efficiently by the technique through proper modeling of the circuit or the device 
to be shielded. 

• more efficient mesh generation packages dedicated to electromagnetics. 
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Most of the extensions mentioned above are not trivial and each could form the 
core of another doctoral dissertation. However, the efficient realization of even a few 
of these topics would result in a very powerful technique for computing electromag- 
netic radiation or scattering from arbitrary three dimensional structures reliably and 
efficiently. 
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APPENDIX A 


Derivation of matrix elements 


The derivation of the matrix elements in ( 4.7) amounts to evaluating the integrals 
in ( 4.4) and ( 4.5). Therefore, from ( 4.4), we have 

i r (v * ■ (v * w ;> = r-ft-gjK (A.i) 

since V x W* = 2g,-. The evaluation of the integral in ( 4.5) is more cumbersome. 
Substituting into ( 4.5) the basis functions defined in ( 4.11), we obtain 

J Ve W, e .W®du = e r J v {(fi-fj) + (r.D) + ( gi x r).(g,- x r)} dv (A.2) 

= e r {h + I2 + Id) 

where 


D = (f. X gi) + (fi X g i) 


I fi.fidv 
Jv c 

(A.3) 

f r.D dv 

Jv e 

(A.4) 

J (g. x r).(gj x r )dv 

(A.5) 



Since f is a constant vector, 7i reduces to 


/, = f..f J v e 


(A.6) 


Since 


4 

X = ^2 Li x i 

t=l 

4 

y = Y, L w 

t-l 

z = j^L iZi 

i= 1 

where Li are the shape functions for the tetrahedral element and x», y«, 2,(1 1, • • • , 4) 

denote the x, y and 2 co-ordinates of the vertices of the tetrahedral element. Using the 
standard formula for volume integration within a tetrahedral element and simplifying, 
we have 


h = 



+ ArI> + ^X> 


«=1 


i=l 


(A.7) 


where D m is the mth component of D. The evaluation of I 3 can be simplified by the 
use of basic vector identities. Therefore, 


h 


g,-g; / |r | 2 dv — l (g,.r)( gj .r)dt; 

JV e 

{9iy9jy + 9iz9jz) J v x * dv + (9ix9jx + 9iz9jz) y^ dv + ( 9ix9jx + 9iy9jy) z dv 
- ( 9ix9jy + 9jx9iy) J v x V dv - ( 9ix9jz + 9jx9iz) J V ' zxdv - (9iy9iz + 9jy9iz ) J V ' V zdv 

(A.8) 


where g im represents the mth component of the vector g,. Each of the volume 
integrals in the above equation can be easily evaluated analytically and the result 
expressed in the following general form: 
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aia m dv 


v e r 4 4 4 

on Z. '/ "f" ^ ^ a li ^ ' ®mi 
L»=l .'= 1 •'= 1 


(A. 9) 


where /, m = 1, • • • , 3 and a x represents the variable x, a 2 stands for the variable y 
and 03 denotes the variable z. 


APPENDIX B 


Derivation of some vector identities 


The curl of a vector in the Dupin coordinate system is given by 

VxE = V r xE + nx^ (B.l) 

where Vy x E is called the surface curl involving only the tangential derivatives and 
is defined as 


V T x E = -n x VE„ + t 2 /ciE tl - i lK 2 E t3 + nV • (E x n) (B.2) 

In the above equation, Ki and k 2 denote the principal curvatures of the surface 
under consideration, E u ,E t a are the tangential components and E n is the normal 

component of the vector E on the surface. 

We are interested in the evaluation of the three vector identities given in Chapter 
5. Let us consider simplifying the tangential components of the curl of a vector, E 
in this case. Using the definition of the curl given above, we have 


nxVxE = -(n x n x V)E„ — — K 2 -Ei 2 t 2 + 


. L 0E\ 

n X \ n X dn j 


= V tE n 


(it 1 + K,En ) {i + + K2E ‘ 2 ) u 1 


= V t E n + n x VxE t 


(B.3) 


where -(n x n x V) = V<. The first vector identity is, therefore, easily proved. 
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Next, we will prove the second of the three identities. We start with the term 
n x V xV ( £ n and simplify it using the definition of the curl of a vector given above. 


n x V xV,£ n 


n x Vx 


V£ n - n 


dE n 


dn 


— n x Vxn 


(n x n x V) 


dEn 

dn 

dE n 


dn 



(B.4) 


Since V • E = V • E t + (V • n)£ n + we can simplify the above relation even 
further by substituting the appropriate expression for the normal derivative of the 
normal component of the electric field and using the fact that the electric field is 
divergence-free in a source-free region. 


n x V xV|£„ = V t (V ■ E t ) + (V • n)V t £„ 

= V 1 (V-E t ) + 2« m V 1 E n (B.5) 

where K m = («i + «2)/2 is the mean curvature. 

The proof of the third identity is more complicated since it involves two curl 
operations on the electric field. We first need to switch the positions of the outermost 
nx and the Vx operators to arrive at a simplified form of the rather complex 
expression. Therefore, 


nxVx(nxVxE) = Vx {n x n x VxE} — nV t • (n x VxE) 

- A/c {(V xE) t2 t x + (V xE) ti t 2 } 

= — Vx {VxE — n(VxE) n } — nV ( • (n x VxE) 

-A/c {(VxE) tj tx + (VxE) ti t 2 } (B.6) 

Now we use the fact that the electric field satisfies the wave equation to reduce the 



expression even further. 


nxVx(nxVxE) = -k]E + Vx {n(VxE)„} + k 2 0 hE n 

—Ak {(V xE) tj ti + (V xE) (i t 2 } 

= Vx {n(V xE)„} - k 2 0 E t - Ak {(VxE) tj t x + (VxE) fj t 2 } 

(B.7) 


where Ak = K\ — k 2 . 

Thus, we have shown that all three identities hold as long as the vector, E in this 
case, is divergenceless and satisfies the vector wave equation. 
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